


Type Checking in VimVal 

by 
Bradley C. Kuszmaul 

Submitted in partial fulfillment of 
the requirements for the degree of 

Bachelor of Science 
at the
Massachusetts Institute of Technology 

June 1984 

© Bradley C. Kuszmaul 1984 

The author hereby grants to M.I.T. permission to reproduce and to 

distribute copies of this thesis document in whole or in part 

Signature of Author 

Department of Electrical Engineering and Computer Science 

10 May 1984 

Certified by 

Jack B. Dennis 
Thesis Supervisor 

Accepted by 

John Guttag 
Chairman, Departmental Committee 



Type Checking in VimVal 

by 

Bradley C. Kuszmaul 

Submitted to the Department of Electrical Engineering and Computer 

Science on 20 June 1984 in partial fulfillment of the requirements for 

the Degree of Bachelor of Science. 

Abstract 

A type system is developed for the revised version of the Val programming 
language (VimVal) which has the following features: 

1. Type Inference: allows programs to be written with incomplete type
specifications. The type checker infers the types of the expressions from
their context.

2. Polymorphism: allows modules to be written which operate on more 
than one type, performing analogous operations on different types of 
data. 

3. Higher order functions: functions are first class data in VimVal. 

4. Recursive types: a type may refer to itself. 

A theory of types is developed which applies to a large class of programming 
languages, including VimVal. First the notion of type is defined, then the 
interaction between types and programs is described, with a definition of type 
correctness. Type correctness is shown to be well defined and decidable, and a type 
checking algorithm is given which performs type checking for VimVal. 
Chapter One
Introduction 



Val (Value-Oriented Algorithmic Language), developed by Ackerman and Dennis
of M.I.T.'s Computation Structures Group [1], explored static data flow
architecture [5] for a side-effect free language. Side-effect free languages implement
functions which, when given a particular set of arguments, always return the same
result (as opposed to languages which allow side effects, in which the result of calling a
function depends on the state of the environment as well as the explicit arguments).
Such languages are sometimes called "functional" because they implement 
mathematical functions. Functional languages are well suited to highly parallel 
computers because changing the order in which different parts of a program are run 
(or running them in parallel) does not change the semantics of the program [7, 3]. 
The Computation Structures Group is now developing a new implementation of a
revised Val, based on an abstract data flow machine called the Val Interpretive
Machine (VIM), which executes data flow instructions directly. The revised version
of Val is called VimVal. The original Val does not support polymorphism, 
recursive data types, recursive functions, higher order functions, or type inference. 
Because it was not expected that the static architecture would implement proper 
function application using data flow, Val function calls are actually implemented 
by compile time "macro expansion", precluding higher order functions in general, 
and recursive functions in particular. Val is a strongly typed language which 
requires that the type of every variable and formal argument be completely and 
explicitly specified. The Vim abstract machine includes mechanisms for function 
application, and the Computation Structures Group is developing an 
implementation of VimVal. Since higher order functions introduce extra 



complexity, we decided to rework the type system for VimVal. Several desired
features for the type scheme of VimVal were proposed, most of which boiled down
to ease of use for the programmer. Ease of use has at least two components:
"writeability" and "readability": it is easier to write programs (at least it involves 
fewer characters to write a program) in a language which requires a minimum of 
symbols, while it is typically easier to read programs written in a language which 
requires the programmer to add redundant information to a program. Thus "ease
of use" has different meanings for different people. Here is a set of criteria for
evaluating the ease of use of a type system.

- The type rules must be easy to remember and express: they should be
simple and consistent.

- The programmer should not be required to write a lot of extra symbols 
just to facilitate type checking. "A lot" is subjective: Some 
programmers like to explicitly specify types, and some programmers 
find that requiring such type specification hinders them. 

- The language should be strongly typed, so that no type errors can occur 
at run time, and so that no type information needs to be represented at 
run time. 

To meet these goals, we have decided to incorporate type inference into VimVal. 
Type inference allows the programmer to write a program with a minimum of type 
declarations. Most types can be deduced from their context, for example the type of 
the constant 3.1415 must be REAL in VimVal, and multiplication of a REAL value 
by some variable x would mean that x must also be REAL. The VimVal compiler 
automatically determines the type of every expression, or gives an error saying that 
some expressions are ambiguously typed (i.e. expressions which have more than one 
possible type), or overconstrained (i.e. expressions which have no possible type). 
The type checking algorithm guarantees that no type errors will occur at run time. 
We adopt the strategy that the programmer should be required to write a minimum
number of extra symbols to facilitate type checking, while allowing a programmer to
optionally add extra type information to a program. We will discuss how well our
type inference system meets our goals in the conclusion of this paper.
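
The flavor of this inference can be sketched with a toy solver. The following is only an illustration under simplifying assumptions: types are scalar names, every constraint is an equality between two terms, and the function name `infer` and its encoding of constraints are hypothetical. It is not the algorithm developed in this thesis, which works on regular sets of paths.

```python
def infer(variables, constraints):
    """Solve equality constraints between variables and scalar type names.

    variables: iterable of variable names.
    constraints: list of (a, b) pairs meaning "a and b have the same type";
        each side is a variable name or a base type name such as "REAL".
    Returns a dict mapping each variable to a type name, to "ambiguous"
    (nothing fixes its type), or to "overconstrained" (conflicting types).
    """
    variables = set(variables)
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in constraints:
        parent[find(a)] = find(b)

    # Record which base types ended up in each equivalence class.
    base_types = {}
    for term in list(parent):
        if term not in variables:
            base_types.setdefault(find(term), set()).add(term)

    result = {}
    for v in variables:
        found = base_types.get(find(v), set())
        if len(found) == 1:
            result[v] = next(iter(found))
        elif not found:
            result[v] = "ambiguous"
        else:
            result[v] = "overconstrained"
    return result
```

In this encoding, multiplying x by the constant 3.1415 contributes the constraint `("x", "REAL")`. A variable that appears in no constraint is reported as ambiguous, and a variable forced to be both REAL and INTEGER is reported as overconstrained.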

VimVal has the following additional features which improve the expressive power 
of the language, while adding some new difficulties to type inference that have not 
been covered by [16, 15, 14]. 

Polymorphism

Polymorphism allows programmers to write functions which perform
analogous operations on different types of data. One example of a
built in polymorphic function is ARRAY_LIMH, which maps from any
array to an integer. Polymorphism and type inference are loosely
coupled in VimVal because we allow any type to be explicitly
written, thus we need a way to denote polymorphic types. The
main restriction on polymorphism is that a formal argument to a
function can not be used polymorphically; only free variables can
be used polymorphically.

Recursive data types 

Recursive types are allowed. In fact any type that can be written 
is allowed. Recursive types are not the same as recursive data. It 
is not possible to construct a recursive data object in VimVal 
because VimVal requires that all data objects be "semantically" 
constructed after their components are constructed. (There are 
two "exceptions" to this rule. It is possible for a function to 
operate on a copy of itself, but the circularity involved is very 
stylized, and the functions are not actually being constructed with 
self-references. VimVal has "early completion structures" [4],
which have certain advantages which do not affect the fact that
recursive data can not be built in VimVal.)

Higher order and recursive functions 

Functions are first class data in VimVal: functions can be 
passed to and returned from functions, and functions can be used 
as parts of structures. Recursive functions are a special case of 
higher order functions. All recursive functions are defined to 
have the same semantics as a program written with explicit 
function arguments to replace recursion. (In a language with
higher order functions, explicitly specifying the type of a function
can be troublesome. See [10] for a discussion of this.)

Previous Work

Semantics of Types and Type Checking 

Much work has been done recently on types. Scott [21] and McCracken [14] view
types as retracts of the universal domain (e.g. special functions on the set of all
objects which can be represented using strings of bits). Milner [16] views types as
ideals (which are special sets of objects meeting certain closure conditions).
Donahue [6] and Demers [2] claim that types are sets of operations (as opposed to 
sets of objects). This approach is contrasted with the algebraic approach, where any 
particular type is specified by its algebraic properties. We unify some of these 
views, and following Solomon [22], we see types as sets of objects with certain 
restrictions. 

Type Inference, Polymorphism and Undecidability 

Langmack [8] showed that two of VimVal's features, type inference and
polymorphism, can combine to make the type correctness of a program an 
undecidable problem. Langmack showed that by either requiring all formal 
arguments to be "monomorphic" (i.e. the arguments must have exactly one type), or 
requiring all formal arguments to be explicitly typed, the undecidability can be 
avoided. Our solution to this problem is to require all formal arguments to have 
exactly one type, i.e. formals must be "monomorphic" [16] at run-time. This rules 
out certain programs, but we believe, with the support of Milner [16], that most
useful programs have the property that all their formals are monomorphic anyway. 

Type Inference Algorithms 

Solomon [22] implicitly described a type checking algorithm for certain kinds of 
languages, where types can be described by regular sets, and the type declarations 
are complete and explicit. This thesis will extend Solomon's work to embrace type 
inference. (Also relevant is the work on type equivalence for types in Algol68 [20],
which uses finite state machines to perform comparisons of types, but we are not
directly concerned with such comparisons.)

Peacock [19] designed a type checking algorithm for VimVal based on constraint 
propagation through a graph representing a VimVal program. As Peacock pointed 
out, his algorithm was driven by side effects (which is not aesthetically pleasing to a 
group working on a purely applicative language such as VimVal), lacked a 
correctness proof, and was not implemented. This thesis corrects and extends
Peacock's work by presenting a type checking algorithm, proving it correct, and 
supplying an implementation of the algorithm. 

Overview 

Our work involves type inference, and we argue that the sets of objects that are of a 
given type are in one to one correspondence with the sets of operations that define a 
type. We note an isomorphism between sets of restrictions and certain sets of 
objects: A given set of restrictions completely and uniquely describes a type, and a 
type completely and uniquely describes certain sets of objects. We go on to use that 
isomorphism between the restrictions and our intuitive understanding of types, to 
define types, because the restrictions are easy to formalize. The types then have 
certain algebraic properties (those of regular sets) which are dependent on the 
restrictions placed on them by a programming language. 

We are interested in applying the algebraic properties of types directly to implement 
the type checker, falling closer to Milner [16] and Scott [21] who are modeling type 
checking, than the algebraists who are modeling the type objects. 

Synopsis 

Chapter 2 defines type in terms of regular sets and finite state automata: types are 
regular sets with a certain decidable property. Chapter 3 describes the interaction 
between types and programs, defining type assignments. Chapter 3 goes on to define
type-correctness in terms of the number of possible type assignments, and shows that 
type-correctness is well defined, and decidable, and that the type assignment for a 
given program is computable. Chapter 4 describes the application of our type 
checking system to VimVal. In conclusion we will examine the type system in
VimVal, and compare it with our ease-of-use goals. 
Chapter Two
Types 



The goal of this chapter is to define the notion of type rigorously. We discuss our 
intuitive notions of types, and how well they fit some currently available 
programming languages. Then using examples from a dialect of LISP, we motivate 
several definitions, which lead to a definition of type-systems and types, which 
formalize our intuition. A type is a description of a set of objects, which have a 
certain property (the type of the objects). The description can be written as a regular 
expression, thus types are isomorphic to regular sets. 

2.1 A Discussion of Type Checking 

Types are easy to use, but difficult to describe. Intuitively, type checking is 
something which can catch certain programming errors (type errors), such as adding 
an integer to a string, or using an array as if it were a function. Many LISP 
implementations provide run time type checking, which detects type errors when 
they happen. This approach is not robust because it is difficult to determine when
all the type errors in a program have been removed. Another approach, which we 
take, is that programs are checked statically for type correctness. In order to 
perform such static type checking, we traditionally have to put up with a loss of 
notational convenience: we may have to add extra symbols to a program to help the 
type checker, or the extra restrictions required for static type checking might mean 
that we are not able to express a program in the way we want to. Another
possibility is that the type checking system might not find all type errors (e.g. the lint
program on UNIX does some type checking on C programs, but it is not guaranteed
to find all type errors). It is difficult to "retrofit" a programming language with
static type checking because it is often impossible to perform complete static type 
checking. (In LISP the property that cdr of nil is never taken can not be statically 
checked, and in C it is not possible to statically check that a pointer value actually 
points to a valid address.) 

Our type theory will follow Leivant [9] and Solomon [22], who model types as 
structural conditions on data objects: given a data object O, and a type T, it is 
possible to decide whether O is of type T by examining the structure of O. This 
approach means that types are sets of objects. In this case, T is a description of the
possible "shapes" of O. We specifically follow Solomon, and claim that T describes
a regular set of paths, where a path is a sequence of symbols in some alphabet 
(called the selectors) which corresponds to a legal sequence of operations on object 
O. This approach means that types are isomorphic to regular sets, and everything we 
want to know about a type can be rephrased in terms of regular sets. 

2.2 The Properties of Types 

Our goal is to define type rigorously. In order to do this we need to deal with some
of the restrictions that we intuitively associate with types (for example no object is
both an integer and a real, and arrays have a "subtype", but integers do not). First
we will describe selectors, then paths. Then we will discuss the restrictions, leading 
to the definition of a type-system. Finally we will define type. 

We will use LISP examples in this chapter, even though the types of LISP do not 
necessarily match the types of VimVal. We use the words "path" and "selector" 
informally to motivate our definitions, which appear below. The dialect of LISP 
that our examples will use has two base types: 
Integers

The only selector for an integer is INT. There is only one path
from an integer, and that is <INT>.

Cons cells

Cons cells have a CAR and a CDR, so the selectors for a cons cell
are CAR or CDR. All paths from a cons cell start with CAR or
CDR. Cons cells can be built with the CONS function. There is
a special cons-cell called NIL, which has CAR and CDR both
NIL. The LIST operator builds a list of cons cells in the standard
way, ending with a NIL. For example:

(LIST X Y Z) =def (CONS X (CONS Y (CONS Z NIL)))

We will be a little sloppy with the type of NIL in our examples,
because NIL is a "polymorphic" value (it could be an empty
LIST of anything), and we have not developed the tools to
discuss NIL's type.

Paths for LISP are sequences with elements in {INT, CAR, CDR}. This set is called
the set of selectors for LISP.

Notation: The set of selectors for a program, denoted $\Sigma$, is some finite set,
which is dependent on the program being type checked.

Elements of $\Sigma$ will be written in uppercase italics, e.g. INT and CAR.

Notation: A path is a sequence, with each element of the sequence in $\Sigma$.
Paths are possibly infinitely long.

The length of a path x is denoted |x|.

If x is a path with $|x| \ge i$, then $x_i$ is the $i$th element of x. The first element
of x is $x_1$.

We write finite paths with angle brackets: x = <INT, CAR> is a path with
$x_1$ = INT and $x_2$ = CAR. The symbol <> denotes the path of length zero
(the empty path).

Paths can be concatenated: if x and y are paths, then $z = x \circ y$ is a path,
where if x is infinite then $z = x$; otherwise $z_i = x_i$ for $i \in \{1, \ldots, |x|\}$, and
$z_{|x|+i} = y_i$ for $i \in \{1, \ldots, |y|\}$ (for every positive integer $i$ when y is
infinite).

The words tuple and string are often used for things which are similar to paths, but
typically tuples and strings are finite in length.
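
The concatenation rule can be sketched in code. This is a minimal sketch under an assumption that is not part of the text: an infinite path is represented only when it is ultimately periodic, as a finite prefix followed by a non-empty cycle repeated forever. The names `Lasso`, `concat` and `element` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lasso:
    """An infinite path: `prefix`, then `cycle` repeated forever."""
    prefix: tuple
    cycle: tuple  # assumed non-empty


def concat(x, y):
    """Concatenate two paths; finite paths are plain tuples of selectors."""
    if isinstance(x, Lasso):
        return x  # nothing can follow an infinite path
    if isinstance(y, Lasso):
        return Lasso(x + y.prefix, y.cycle)
    return x + y


def element(x, i):
    """Return the i-th element of a path, counting from 1 as in the text."""
    if i < 1:
        raise ValueError("path positions start at 1")
    if isinstance(x, Lasso):
        if i <= len(x.prefix):
            return x.prefix[i - 1]
        return x.cycle[(i - len(x.prefix) - 1) % len(x.cycle)]
    return x[i - 1]
```

For instance, `concat(("CAR",), ("INT",))` is the finite path <CAR, INT>, while concatenating anything after `Lasso((), ("CDR",))` leaves that infinite path unchanged.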

Consider the LISP value, O, generated by

(CONS 1 2).

Here, O is a cons cell containing an integer in both its car and its cdr, the set of paths
for O is { <CAR, INT>, <CDR, INT> }, and this set defines the "type" of O (see
Figure 2-1).




Figure 2-1: (CONS 1 2) cell, with paths: { <CAR, INT>, <CDR, INT> }
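
The set of paths of a finite object can be computed by walking its structure. The sketch below assumes a hypothetical encoding in which integers are Python ints and a cons cell is a 2-tuple (car, cdr); NIL is left out because its CAR and CDR refer back to NIL.

```python
def paths(obj):
    """Return the set of selector paths for a LISP-like value.

    Integers are Python ints; a cons cell is a 2-tuple (car, cdr).
    Each path is a tuple of selector names ending in "INT".
    """
    if isinstance(obj, int):
        return {("INT",)}
    car, cdr = obj
    return ({("CAR",) + p for p in paths(car)} |
            {("CDR",) + p for p in paths(cdr)})
```

Under this encoding, `paths((1, 2))` yields the two paths <CAR, INT> and <CDR, INT> of the (CONS 1 2) example.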



The previous example describes a type which is a finite set of finite paths. The next 
example illustrates a type which is an infinite set of infinitely long paths. Consider 
the type equation T = CONS[T,T]. The paths for this type are infinitely long, and
consist of any sequence of CARs and CDRs. 
For the next example (which will give an example of a type which is an infinite set,
all the elements of which are finite except for one) we need a few standard
definitions, adapted from [11]. We will also need the following definitions to define
type.

Definition 2-1: If A and B are sets of paths, then the composition of A and
B is

$$A \circ B \stackrel{\mathrm{def}}{=} \{ \sigma \circ \rho \mid \sigma \in A, \rho \in B \},$$

where $\sigma \circ \rho$ is the concatenation of path $\sigma$ and path $\rho$.

The definition of concatenation of paths automatically takes care of the case where 
some of the elements of A or B are infinite. 

We want to compose $i$ copies of a set of paths, A, where $i$ can be a finite integer, or it
can be $\infty$. The case of a finite integer is adapted directly from [11], while we need
an extra definition to define the case of $i$ infinite.

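Definition 2-2: If A is a set of paths, and $i$ is a finite integer, then $A^i$ is
defined recursively:

- $A^0 \stackrel{\mathrm{def}}{=} \{ \langle\rangle \}$ (i.e. the set containing the empty path, not $\emptyset$)

- $A^i \stackrel{\mathrm{def}}{=} A \circ A^{i-1}$ for $i > 0$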

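Definition 2-3: A path $\sigma$ is an initial segment of a path $\gamma$ if there is some
path $\rho$ such that $\sigma \circ \rho = \gamma$.

Definition 2-4: If A is a set of paths, then

$$A^\infty \stackrel{\mathrm{def}}{=} \{ \sigma \mid \forall i < \infty, \exists \rho \in A^i \text{ such that } \rho \text{ is an initial segment of } \sigma \}.$$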



If A is a set of paths, then $A^i$ is the set of paths which are made by concatenating $i$
elements of A together. $A^\infty$ is the set of paths which are made by concatenating an
infinite number of elements of A together.
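
For finite sets of finite paths, composition and finite powers can be computed directly. The following sketch assumes paths are Python tuples; the function names are hypothetical, and the infinite power is out of reach of this finite representation.

```python
def compose(a, b):
    """Composition of two finite sets of finite paths (Definition 2-1)."""
    return {x + y for x in a for y in b}


def power(a, i):
    """The i-fold composition of a with itself (Definition 2-2).

    power(a, 0) is the set holding only the empty path, not the empty set.
    """
    if i < 0:
        raise ValueError("i must be a non-negative integer")
    result = {()}
    for _ in range(i):
        result = compose(a, result)
    return result
```

Note the difference between `power(set(), 0)`, which contains the empty path, and `power(set(), 1)`, which is empty.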

Definition 2-5: The Kleene star operator on sets, written $^*$, denotes the
operation

$$A^* \stackrel{\mathrm{def}}{=} \bigcup_{i \in \{0, \ldots, \infty\}} A^i.$$

Intuitively, $A^*$ is the set of all paths which are concatenations of zero or more
elements of A. Note that we allow an infinite concatenation of elements of A.

Definition 2-6: The Kleene plus operator on sets, written $^+$, is

$$A^+ \stackrel{\mathrm{def}}{=} \bigcup_{i \in \{1, \ldots, \infty\}} A^i.$$

Note that $A^* = \{ \langle\rangle \} \cup A^+$.

Now we have the tools to examine an interesting type in our LISP dialect. The type 
LIST[U] is useful in LISP, and our type system can express the semantics of this 
type. 

Given a cons cell O of type T with car of type U (where U is the set of legal paths for
an object of type U), and cdr of the same type as O (i.e. any operation legal on O is
also legal on cdr(O), making T a recursive type), we have

$$T = \{ \langle \mathit{CDR} \rangle \}^* \circ \{ \langle \mathit{CAR} \rangle \} \circ U.$$

An object of the type shown in Figure 2-2 might be generated by (LIST 1 2 3),
where U is { <INT> } in this case. Note that T is a regular set, and can thus be
accepted by a finite state automaton if U is a regular set.

Note also that one of the elements of T is the infinite path x such that $x_i$ = CDR
for all positive integers $i$.
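
Membership of a finite path in this set can be tested by stripping the leading CDR selectors. The sketch below assumes paths are tuples and that U is given as a finite set of finite paths; the function name is hypothetical, and the infinite all-CDR path is outside what this finite check covers.

```python
def in_list_type(path, u):
    """Test whether a finite path lies in {<CDR>}* o {<CAR>} o U.

    path: tuple of selector names.
    u: finite set of finite paths (the element type U).
    """
    i = 0
    while i < len(path) and path[i] == "CDR":
        i += 1  # skip the leading run of CDR selectors
    if i == len(path) or path[i] != "CAR":
        return False  # the first non-CDR selector must be CAR
    return path[i + 1:] in u
```

With U = { <INT> }, the path <CDR, CDR, CAR, INT> is accepted (it reaches the third element of a list such as (LIST 1 2 3)), while <CDR, INT> is not.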

The examples we have presented have types which can be represented by regular 
sets. Solomon [22, 23] showed that the only types we should consider are the ones 
which can be represented by regular sets. We place an additional restriction, (the 
details of which are dependent on the programming language that the type system is 
being implemented for), that some regular sets are illegal as types. In our dialect of 
LISP, for example, the set { <CAR>, <INT> } is illegal, because there is nothing 
which has a CAR and is an integer. Thus for a given programming language there
are selector classes which provide the information to check for illegal sets like
{ <CAR>, <INT> }.

Figure 2-2: Recursive type, with an object of the type T = CONS(INT, T)
(also known as LIST[INT]), along with the FSA which accepts T

We require that the selector classes for a given programming language partition $\Sigma$
into equivalence classes.

In VimVal, each equivalence class in the selector classes represents a different 
"type class" or "type generator" (such as ARRAY, RECORD or INT). This 
method of partitioning $\Sigma$ would allow us to generalize our type system to include
abstraction, and this possibility is discussed briefly in the conclusion. It is not 
essential to our work on the type checking algorithm that the selector classes are 
formed according to the rule that each class corresponds to a "type generator". 
The selector classes for our LISP dialect are

- { CAR, CDR }

- { INT }

Some selector symbols can not be followed by any other selectors. Our LISP dialect 
does not allow paths of the form <INT,CAR,...>, because that would imply that 
there is some object which is an integer, which has a CAR. (It is not clear what such 
a path would mean, but we do not want it.) Thus, for a given programming 
language, some elements of $\Sigma$ can only appear as the last element of a finite path.

Notation: The set of terminators, a subset of $\Sigma$, is the set, defined by the
programming language, such that any path having a terminator in a non-final
position is illegal.

In our LISP dialect, { INT } is the set of terminators.

In VimVal and our LISP dialect, the terminators correspond to "scalar" types, or 
"base" types. We do not, however, require that such a correspondence hold for our 
type checking algorithm to work. 

A few extra definitions are needed to define types. We want to be able to talk about
the "first part" of a set of paths, and the "last part" of a set of paths, so that types can
be described in terms of these properties.

Definition 2-7: The head of $U \subseteq \Sigma^+$ is the set of first elements of the
paths in U.

$$\mathit{head}(U) \stackrel{\mathrm{def}}{=} \bigcup_{u \in U} \{ u_1 \}$$

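Definition 2-8: The rest of a path $\sigma \in \Sigma^+$ is $\sigma$ with the first element
removed.

$$\mathit{rest}(\sigma) \stackrel{\mathrm{def}}{=} \rho \text{ such that } \langle \sigma_1 \rangle \circ \rho = \sigma$$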

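Definition 2-9: The tail of $U \subseteq \Sigma^+$ is U with the first element of every
path removed.

$$\mathit{tail}(U) \stackrel{\mathrm{def}}{=} \{ \sigma \mid \exists \rho \in U \text{ where } \sigma = \mathit{rest}(\rho) \}$$

Definition 2-10: If $X \in \Sigma$ then the X-selected tail of $U \subseteq \Sigma^+$ is

$$\mathit{tail}_X(U) \stackrel{\mathrm{def}}{=} \{ \mathit{rest}(\gamma) \mid \gamma \in U \text{ and } \gamma_1 = X \}.$$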
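
For finite sets of non-empty finite paths, these four operations translate almost literally into code. The sketch assumes paths are tuples; the function names mirror the definitions but are otherwise hypothetical.

```python
def head(u):
    """Set of first elements of the paths in u (Definition 2-7)."""
    return {p[0] for p in u}


def rest(p):
    """A path with its first element removed (Definition 2-8)."""
    return p[1:]


def tail(u):
    """Every path of u with its first element removed (Definition 2-9)."""
    return {rest(p) for p in u}


def selected_tail(u, x):
    """Rests of the paths in u whose first element is x (Definition 2-10)."""
    return {rest(p) for p in u if p[0] == x}
```

Applied to the paths of (CONS 1 2), the head is { CAR, CDR } and the CAR-selected tail is { <INT> }.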

Now we can encapsulate all the type information for a given programming language
into a type system. Type systems are dependent on their programming language: the
correctness of a type system depends on the semantics of the programming language
associated with it. We often refer to a type system as a programming language in this
paper, because of this dependence.

Definition 2-11: A type system L is a three-tuple $\langle \Sigma, C, \mathcal{T} \rangle$ where

- $\Sigma$ is the set of selectors in L,

- C is the set of selector classes, which partitions $\Sigma$,

- and $\mathcal{T}$ is the set of terminators, $\mathcal{T} \subseteq \Sigma$.

In order to define type, we need to be able to talk about certain properties of regular
sets which are easily defined recursively. One such property is that for any selector
a, the a-selected tail of a type, T, must also be a type (or be empty). This recursion
could be a real problem: e.g. for the type LIST[U], the CDR-selected tail of the type
LIST[U] is LIST[U]. There is no obvious way to terminate the recursion. By
constructing a finite state automaton (FSA) which accepts the regular set, we can
perform the tests we are interested in without resorting to such infinite recursion.
The following definitions, which describe properties of FSA, were adapted from
[11].
Definition 2-12: An FSA is a tuple $(K, \Sigma, \delta, s, F, \mathcal{R})$ where

- K is a finite set of states,

- $\Sigma$ is an input alphabet,

- $\delta$ is a function mapping some subset of $K \times \Sigma$ into K,

- s is a start state ($s \in K$),

- F is a set of accepting states ($F \subseteq K$),

- and $\mathcal{R}$ is a reject state ($\mathcal{R} \in K$),

and $\delta(\mathcal{R}, a)$ is undefined for all $a \in \Sigma$.

Definition 2-13: A configuration of an FSA is a pair $(k, \sigma)$ with $k \in K$ and
$\sigma \in \Sigma^*$.

Definition 2-14: A binary relation $\vdash_M$ holds between configurations of
M, an FSA. $(k, \sigma) \vdash_M (k', \sigma') \iff \delta(k, \sigma_1) = k'$ and $\mathit{rest}(\sigma) = \sigma'$, in which
case we say that $(k, \sigma)$ yields $(k', \sigma')$ in one step. We denote the reflexive
transitive closure of $\vdash_M$ as $\vdash^*_M$. If $\delta(k, \sigma_1)$ is undefined, then
$(k, \sigma) \vdash_M (\mathcal{R}, \sigma)$. ($\sigma_1$ is the first element in the path $\sigma$.)

Definition 2-15: An FSA, M, accepts a path $\sigma$ if the following hold:

- If $\sigma$ is finite then $(s, \sigma) \vdash^*_M (f, \langle\rangle)$ for some $f \in F$.

- If $\sigma$ is infinite then it is not true that $(s, \sigma) \vdash^*_M (\mathcal{R}, \sigma')$ for any path
$\sigma'$.

Note that if an FSA reaches a configuration $(k, \sigma)$ where $\delta(k, \sigma_1)$ is undefined, then the
FSA "hangs", and never accepts its input. Specifically, if an FSA reaches the state $\mathcal{R}$,
then the input is not accepted.

The set of paths accepted by an FSA is a regular set of paths, and is called the set 
that the FSA accepts. 
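
Both clauses of the acceptance definition can be simulated. The sketch below makes several assumptions of its own: the transition function is a Python dict keyed by (state, selector), a missing entry stands for the move to the reject state, and infinite paths are limited to ultimately periodic ones given as a prefix and a non-empty cycle. The automaton `LIST_INT` and its state names are hypothetical.

```python
def run(delta, state, path):
    """Follow `path` from `state`; return None once delta is undefined.

    None plays the role of the reject state, so it must not be used as
    the name of an ordinary state.
    """
    for symbol in path:
        state = delta.get((state, symbol))
        if state is None:
            return None
    return state


def accepts_finite(delta, start, accepting, path):
    """Finite clause: consume the whole path and end in an accepting state."""
    return run(delta, start, path) in accepting


def accepts_periodic(delta, start, prefix, cycle):
    """Infinite clause for the path prefix + cycle + cycle + ...

    Such a path is accepted when the reject state is never reached.
    """
    if not cycle:
        raise ValueError("cycle must be non-empty")
    state = run(delta, start, prefix)
    seen = set()
    # Once a state repeats at a cycle boundary, the run loops forever.
    while state is not None and state not in seen:
        seen.add(state)
        state = run(delta, state, cycle)
    return state is not None


# Hypothetical automaton for LIST[INT]; the state names are made up.
LIST_INT = {("s", "CDR"): "s", ("s", "CAR"): "e", ("e", "INT"): "f"}
```

With accepting set {"f"}, this automaton accepts <CDR, CAR, INT> and the infinite all-CDR path, and rejects <INT, CAR>.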
We now have everything needed to define type.

Definition 2-16: T, a regular set of paths, is a type in a programming
language $\langle \Sigma, C, \mathcal{T} \rangle$ if there is some FSA, $M = (K, \Sigma, \delta, s, F, \mathcal{R})$,
accepting T such that

- M rejects <>.

- Given a state k, if $H_k = \{ a \in \Sigma \mid \delta(k, a) \text{ is defined} \}$, then $H_k$ is a
subset of some selector class in C.

- For every state $k \in K$, and every symbol $X \in \Sigma$, if $\delta(k, X) \in F$, then
$X \in \mathcal{T}$. (Terminators occur only at the end of finite paths.)

It is not necessary to force M to be unique in the definition of type, because if T is a 
type, and N is an FSA which accepts T, then N meets the conditions imposed on M 
in the definition of type. We leave this assertion without proof. 
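
The three conditions can be checked mechanically on a given automaton. The sketch below assumes the same hypothetical dict encoding of the transition function, keyed by (state, selector), and it inspects only the automaton it is handed; the function and parameter names are illustrative.

```python
def is_type_fsa(delta, start, accepting, selector_classes, terminators):
    """Check the three conditions of Definition 2-16 on one automaton.

    delta: dict mapping (state, selector) to the next state.
    selector_classes: iterable of sets that partition the selectors.
    terminators: set of terminator selectors.
    """
    # Condition 1: the empty path must be rejected.
    if start in accepting:
        return False
    outgoing = {}
    for (state, symbol), target in delta.items():
        outgoing.setdefault(state, set()).add(symbol)
        # Condition 3: only terminators may lead into an accepting state.
        if target in accepting and symbol not in terminators:
            return False
    # Condition 2: the selectors leaving each state share one selector class.
    classes = [set(c) for c in selector_classes]
    return all(any(symbols <= c for c in classes)
               for symbols in outgoing.values())
```

For the LISP dialect, an automaton accepting { <CAR>, <INT> } fails the check, which matches the earlier remark that this set is illegal as a type.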
Chapter Three
Type Checking 



Now that we have defined types, we can define type-checking by specifying the 
interactions between types and programs. A program has a set of nodes that we
want to type (to type node N is to assign a type to N), and some information about
the types of the nodes (which we call operators). We first lay some groundwork,
defining concepts such as program and type-assignments, and then define 
type-correctness in terms of the number of possible type-assignments for a program. 
We conclude this chapter by showing that type-correctness is well defined and 
decidable. 

3.1 Type Assignments and Programs 

Our type checking algorithm will try to infer the type of every node in a program 
from its "context". We need to specify what we mean by the "context" of a node, 
and to do that we will define three kinds of "operators" on nodes: parameterized
restrictions, containers, and closures.

Notation: The set of node names is denoted $\mathcal{N}$. $\mathcal{N}$ must be disjoint from
$\Sigma$.

$\mathcal{N}$ might be infinite, but any given program will only use a finite subset of $\mathcal{N}$.

A type assignment gives us a way to associate a type with a node in a program. 



Nodes are roughly equivalent to expressions, except that there may be some expressions that we
will not want to type (for example expressions in a module which is never used), and there are some
things that we might want to type which are not expressions (for example type declarations). See [19]
for a more complete discussion of nodes.



Definition 3-1: A type assignment R is a regular subset of $\mathcal{N} \circ \Sigma^*$ such that
$\forall x \in \mathit{head}(R)$, $\mathit{tail}_x(R)$ is a type.

Notation: The set of all type assignments is denoted $SOTA_{all}$. Subsets of
$SOTA_{all}$ are elements of the power set of $SOTA_{all}$, written $\mathcal{P}(SOTA_{all})$.

There is an interesting isomorphism between type assignments and mappings from
P.NodeNames to types. Given a program P, and a type assignment T, there is a
mapping U from $\mathcal{N}$ to types such that $U(n) = \mathit{tail}_n(T)$. Conversely, given a mapping U, there is
a type assignment T such that $\mathit{tail}_n(T) = U(n)$. We named type assignments type
assignments because they are isomorphic to mappings which assign a type to every
node, and we will freely, without warning, use this isomorphism when it is
convenient.

We are interested in finding which type assignments are consistent with the 

"context" in which each node appears in a program. 

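Definition 3-2: Given an alphabet J, if $\sigma \in J^*$, $\sigma$ finite, and R is a regular
set over J, then the regular set after $\sigma$ in R is

$$\mathit{after}_\sigma(R) \stackrel{\mathrm{def}}{=} \{ \sigma' \mid \sigma \circ \sigma' \in R \}.$$

Note that for a symbol $x \in J$, $\mathit{tail}_x(R) = \mathit{after}_{\langle x \rangle}(R)$.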
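
On a finite set of finite paths, the set after a given prefix is a simple filter. The sketch below assumes paths are tuples whose first element is a node name; the node names "n1" and "n2" used in the usage note are hypothetical.

```python
def after(sigma, r):
    """Paths that may follow the finite prefix sigma inside r (Definition 3-2)."""
    n = len(sigma)
    return {p[n:] for p in r if p[:n] == sigma}


def selected_tail(r, x):
    """The x-selected tail of r, for comparison with after((x,), r)."""
    return {p[1:] for p in r if p and p[0] == x}
```

For R = { <n1, INT>, <n2, CAR, INT>, <n2, CDR, INT> }, `after(("n2",), R)` gives the paths of a (CONS 1 2) cell, and it coincides with `selected_tail(R, "n2")`, as the note above states.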

A parameterized restriction gives us the ability to say that two nodes are the same
type. First we can specify the two nodes n and n' that we are interested in by giving
two paths, $\sigma$ and $\sigma'$ respectively. Any FSA which represents a type assignment
which is consistent with a given parameterized restriction has the property that if we
start from the start state of the FSA and $\sigma$ and $\sigma'$ lead to states k and k' respectively,
then the languages accepted by starting at k and k' must be the same. This is
equivalent to saying that there must be some FSA accepting the same language such
that k = k'. We formalize this with the definition of state-equivalent, and then define
parameterized restriction.
Definition 3-3: Given a regular set R, two paths $\sigma$ and $\sigma'$ are
state-equivalent if $\mathit{after}_\sigma(R) = \mathit{after}_{\sigma'}(R)$, in which case we write $\sigma \equiv_R \sigma'$.

Definition 3-4: Given a set of regular sets A (with every regular set in A
over a fixed alphabet J), for every pair $(\sigma, \sigma') \in J^* \times J^*$, with $\sigma$ and $\sigma'$
finite, there is a set of regular sets $\{ R \mid R \in A \text{ and } \sigma \equiv_R \sigma' \}$. We call this
set the parameterized restriction of $(\sigma, \sigma')$, and write the set as $A_{\sigma \equiv \sigma'}$.



A container gives us the ability to say that the type assignment for our program has a
given path $\sigma$ in it.

Definition 3-5: Given a set of regular sets A (with every regular set in A
over a fixed alphabet J), for every $\sigma \in J^*$ there is a set of regular sets
$\{ R \mid R \in A \text{ and } \sigma \in R \}$. We call this set the container of $\sigma$ in A, and
write the set as $A_\sigma$.

A closure gives us the ability to say that a given node must have selectors which are a 

subset of some finite set of selectors. We choose the node by giving a path, and 

specify the set by listing it. 

Definition 3-6: Given a set of regular sets $A$ (with every regular set in $A$
over a fixed alphabet $\Sigma$), a finite set of symbols $\mathcal{X}$, and a finite path
$\sigma$ over $\Sigma$, there is a set of regular sets
$\{\, R \mid R \in A \text{ and } \mathit{head}(\mathit{after}_\sigma(R)) \subseteq \mathcal{X} \,\}$. We call this set the closure under
$\mathcal{X}$ of R selected by $\sigma$, and write the set as $A_{\sigma, \mathcal{X}}$.
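
Containers and closures can be sketched in the same finite setting. The sketch assumes that the head of a language is the set of first symbols of its non-empty paths (the definition of head lies outside this section), and all function names and the three candidate assignments are hypothetical.

```python
def after(sigma, language):
    n = len(sigma)
    return {p[n:] for p in language if p[:n] == sigma}


def head(language):
    """Assumed meaning of head: the first symbols of the non-empty paths."""
    return {p[0] for p in language if p}


def container(family, sigma):
    """Members of family that contain the path sigma."""
    return {R for R in family if sigma in R}


def closure(family, symbols, sigma):
    """Members of family whose selectors after sigma all lie in symbols."""
    return {R for R in family if head(after(sigma, R)) <= set(symbols)}


as_int = frozenset({(1, "INT")})
as_real = frozenset({(1, "REAL")})
as_record = frozenset({(1, "GET-a", "INT"), (1, "GET-b", "INT")})
family = {as_int, as_real, as_record}

print(container(family, (1, "INT")) == {as_int})
print(closure(family, {"INT", "REAL"}, (1,)) == {as_int, as_real})
```

The closure rejects the record candidate because node 1 there has the selectors GET-a and GET-b, which are outside the permitted set.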

Now that we have defined the kinds of restrictions we want to make on type 

assignments, we define an operator to be one of those restrictions. A program will 

actually consist of some nodes and some operators. 

Definition 3-7: An operator OP is a subset of $SOTA_{all}$ which is either a
parameterized restriction, a container, or a closure of $SOTA_{all}$.

Notation: If OP is an operator, then the operands of OP are the node
names mentioned in OP.

The meaning of an operator is that if there is some restriction on the types of some 
nodes in a program, the operator contains the information describing the restriction. 
For example, if, given a program, we have an operator which requires (informally)
that "if the type of node 1 is T, then the type of node 2 is LIST[T]" then the operator
is $\{\, R \mid R \in SOTA_{all} \text{ and } \langle 1 \rangle \equiv_R \langle 2, LIST \rangle \,\}$. A more concise way of writing this set
is $(SOTA_{all})_{\langle 1 \rangle \equiv \langle 2, LIST \rangle}$.

Definition 3-8: A program P is an ordered pair (NodeNames, ops), 

- where NodeNames is a set of node names (a finite subset of Jf). 
referred to as P.NodeNames, 

- and ops is a finite set of operators, where each operator's operands
are a subset of the names of the nodes in the program (i.e.
$\forall x \in ops$, $\mathit{head}(x) \in P.NodeNames$). This set is referred to as P.ops.

Notation: The set of all programs is referred to as $\Pi$.

By taking all of the operators in a program, and combining their information, we can 

deduce the type assignment for a program. 

Notation: The intersection of all the operators in a program is called the 
complete-restriction of the program. 

Definition 3-9: complete-restriction: $\Pi \rightarrow \mathcal{P}(SOTA_{all})$ is a function mapping programs
into sets of type assignments. Given a program P, complete-restriction(P) is defined by

complete-restriction(P) $=_{\mathrm{def}} \bigcap_{x \in P.ops} x$.



3.2 Type Correctness: "There is a solution"

Definition 3-10: A program P is type correct if |complete-restriction(P)| = 1.

Definition 3-11: A program P is type ambiguous if |complete-restriction(P)| > 1.

Definition 3-12: A program P is type overconstrained if |complete-restriction(P)| = 0.
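
The three definitions amount to a case split on the size of an intersection. The following sketch assumes a small finite universe of candidate type assignments standing in for the set of all type assignments, with operators given as plain Python sets; the labels `t1`, `t2`, `t3` and the function names are hypothetical.

```python
from functools import reduce


def intersect_operators(universe, operators):
    """Intersect every operator, starting from the whole universe."""
    return reduce(lambda acc, op: acc & op, operators, set(universe))


def classify(universe, operators):
    """Classify a program by how many type assignments survive."""
    remaining = len(intersect_operators(universe, operators))
    if remaining == 0:
        return "type overconstrained"
    if remaining == 1:
        return "type correct"
    return "type ambiguous"


universe = {"t1", "t2", "t3"}
print(classify(universe, [{"t1", "t2"}, {"t2", "t3"}]))  # type correct
print(classify(universe, [{"t1", "t2"}]))                # type ambiguous
print(classify(universe, [{"t1"}, {"t2"}]))              # type overconstrained
```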

Theorem 3-13: Type correctness is well defined, and is independent of the 
order in which the restrictions are examined for a given program. 

Proof: Set intersection is associative and commutative.

Peacock's proposed implementation of type checking for VimVal [19] used a graph,
through which information about the restrictions of the operators of a program was
propagated. Peacock's thesis posed the question: "Can changing the order in which
constraints are propagated through the graph change the final answer?". We can
answer "no" to this question, because the restrictions are combined by set
intersection: if $\alpha$ and $\beta$ are such that, given a set of regular sets R, $R_\alpha$ and $R_\beta$ are
operators, then restricting by $\alpha$ and then by $\beta$ yields the same set as restricting
by $\beta$ and then by $\alpha$.
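
The order independence can be checked mechanically on a small case. This sketch assumes operators are finite sets of candidate labels (the labels are hypothetical) and intersects them in every possible order.

```python
from itertools import permutations


def intersect_in_order(operators):
    """Intersect the operators one at a time, in the order given."""
    result = None
    for op in operators:
        result = set(op) if result is None else result & op
    return result


ops = [{"t1", "t2", "t3"}, {"t2", "t3"}, {"t3", "t4"}]

# Collect the distinct answers obtained over all six orderings.
answers = {frozenset(intersect_in_order(order)) for order in permutations(ops)}
print(len(answers))
```

Every ordering yields the same set, so the collection of distinct answers has exactly one element.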

We accept without proof the following: 

Proposition 3-14: If a program is type correct, then no "type errors" (in the 
intuitive sense) will occur while running the program. 

This is difficult to prove, because it is dependent on the semantics of the language 
the program is written in. Even if the language's type system conforms to our 
model, the correctness of type correctness depends on how accurately the set of 
operators for the language is described. Given a careful semantic model for a 
programming language, and a set of operators which are consistent with the model, 
a proof of this proposition would involve showing that if the local constraints 
imposed by the operators are true then no type errors will occur at run-time. 
Milner [16] proves this proposition for the language he considers. We will leave this 
proposition unproven for VimVal. 

3.3 An Algorithm for Determining Type Assignments 

Theorem 3-13 shows that we can talk about type correctness for incompletely typed 
programs with recursive types, and gives a definition of type correctness, but it does 
not give us an algorithm for determining those types. In this section we will prove 
that there is an algorithm for computing the type assignment for a given program. 

If x is the intersection of a finite collection of operators, we need to show that it is
possible to compute whether |x| = 0, |x| = 1, or |x| > 1. If |x| = 1, i.e. x = { y } for
some type assignment y, then we need to show that we can actually compute y.
Specifically, we need to be able to build a FSA which accepts y, so that the VimVal
compiler can use the type information to compile a program. (Other representations
of regular sets would be equivalent to building a FSA which accepts y [11].)

Theorem 3-15: Given a program P, the type correctness of P is decidable. 
If P is type correct, then the type assignment is computable. 

Proof: Suppose P has operators equal to the union of some containers
described by the set of paths $\{\, x_i \mid i = 1, \ldots, n \,\}$, some parameterized
restrictions described by the set of pairs of paths $\{\, (y_j, z_j) \mid j = 1, \ldots, m \,\}$, and
some closures $\{\, (a_i, w_i) \in \mathcal{P}(\Sigma) \times \Sigma^* \mid i = 1, \ldots, l \,\}$, where X = MWZ.

We need to determine how many type assignments (which are regular
sets) there are that are elements of every operator in P. Since type
assignments are regular sets, we can consider the FSA's which
accept the type assignments. In general, there will be more than one FSA
which accepts a given type assignment, but we can consider, without loss
of generality, the set of FSA's with no more than p states, where

$p = |P.NodeNames| + \sum_{i=1}^{n} |x_i| + \sum_{j=1}^{m} (|y_j| + |z_j|) + \sum_{i=1}^{l} |w_i| + 3$.

The reason we can make this reduction is that the FSA's which accept
the languages described by any operator all have a bounded number of
states, thus the FSA's which accept languages in the complete
restriction of a program also have a bounded number of states. Our
bound is correct because if there are two languages meeting the
restrictions of operators of the program, then there are two which need at
most p states: it is possible that every time a node or symbol is mentioned
by an operator, another state will be needed, plus we add one for the
rejecting state, one for an accepting state, and one for an "unconstrained"
state which can be used to make type assignments different for two FSA's
(assuming the unconstrained state is reachable from the start state).
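
As a worked instance of the bound, the sketch below reads p as the number of node names, plus the length of every path mentioned by a container, a parameterized restriction, or a closure, plus three. The function name, its parameters, and the example program are assumptions made for illustration.

```python
def state_bound(node_names, container_paths, restriction_pairs, closure_paths):
    """Upper bound p on the number of FSA states that need to be considered."""
    return (
        len(node_names)
        + sum(len(x) for x in container_paths)
        + sum(len(y) + len(z) for y, z in restriction_pairs)
        + sum(len(w) for w in closure_paths)
        + 3  # rejecting, accepting, and "unconstrained" states
    )


# Node 1 is an INT, node 2 is a LIST of whatever node 1 is, node 2 is closed.
p = state_bound(
    node_names={1, 2},
    container_paths=[(1, "INT")],
    restriction_pairs=[((1,), (2, "LIST"))],
    closure_paths=[(2,)],
)
print(p)  # 11
```

Here 2 node names, 2 symbols in the container path, 1 + 2 symbols in the restriction, 1 symbol in the closure path, and the 3 extra states give p = 11.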

Since it is decidable whether the language accepted by a given FSA is in a
given operator, we simply need to generate a list of all FSA's with no more
than p states, filter out the ones which do not accept a type assignment,
and determine which of them are members of every operator in P. Given
this new set of FSA's which are in every operator of P, we need to
determine whether they all accept the same language, which is decidable.
If they do, then the program is type correct. If they do not, then the
program is type ambiguous. Of course, if there is no FSA which accepts a
language which is in every operator of P, then P is type overconstrained.

If a program P is type correct, then the type assignment is the language
accepted by one of the FSA's that is found by the algorithm described above.

It is not really satisfying to be forced to use an algorithm as inefficient as the
algorithm described above for determining type correctness. This algorithm is
exponential in the size of the input program since the number of FSA's of size p
is exponential in p.
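
To give a feel for the growth, the sketch below counts automata under one particular convention that the text does not fix: complete deterministic automata with p labelled states, a fixed start state, and an alphabet of k symbols. Under that assumption there are p^(p*k) transition tables and 2^p choices of accepting states.

```python
def count_dfas(p, k):
    """Number of complete DFAs with p labelled states and k symbols,
    assuming a fixed start state (a convention chosen for this sketch)."""
    transition_tables = p ** (p * k)  # each (state, symbol) pair picks a target
    accepting_sets = 2 ** p           # each state is accepting or not
    return transition_tables * accepting_sets


for p in range(1, 6):
    print(p, count_dfas(p, 2))
```

Even with only two symbols the count grows at least exponentially in p, which is why enumerating every candidate automaton is impractical.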

Because VimVal, the actual language we are trying to type check, has very stylized
operators, we were able to find an algorithm for type checking which is usually more
efficient. Chapter Four describes VimVal in more detail.

Chapter Four
Type Checking in VimVal 



This chapter describes the types of VimVal [24], and how VimVal interacts with
the type system developed in chapters 2 and 3. We deal with function recursion and 
polymorphism so that our type system can handle VimVal, then we describe the 
operators of the VimVal language. 

4.1 The Semantics of Modules 

A VimVal program consists of a set of modules, which can be compiled separately. 
Modules may use free names, which are references to other modules. The 
bindings of the free names are resolved at link time, possibly with the explicit help 
of the programmer. VimVal allows a module M with a free name "P" to bind
"P" to N, even though the name of module N is not "P". Unfortunately, the
programmer may be required to help the linker resolve free names.

Every module is really a generator: when a module is bound to a free name, the 
module is augmented in whatever ways are possible and necessary to bring it into 
conformance with its use (i.e. it is copied, and then modified). Thus, when a 
programmer uses the built-in ARRAY-SIZE function in VimVal, a copy is made so 
that whatever type constraints are added to the ARRAY-SIZE function (for example 
if the programmer uses it on an array of integers) are not propagated to other uses of 
the ARRAY-SIZE function. 

Note that we do not require that there be a unique type assignment for each 
module, only that there be a unique type assignment for each augmented version of 
every module. The semantics of modules does not specify that a module must be a
function. A module could be some other kind of value, or even a second-class value 
such as a type, since the type restrictions for each of these cases could be expressed 
as operators. 

After a copy of a module is made, the type checking system must decide on exactly
one type for the module. This implies that every subexpression of the
module must have exactly one type: in particular the arguments to functions must
have exactly one type. This precludes certain programs which use "run-time"
polymorphism (such as the "standard" LISP interpreter).

4.2 Recursive Functions 

VimVal allows functions to call each other recursively, with the restriction that 
there can be no mutual recursion between modules. (Mutual recursion between 
functions defined inside a module is allowed.) All recursive functions are really 
treated as higher order functions, which pass other functions, perhaps copies of 
themselves, around. This implies that recursive functions, whether directly or 
indirectly recursive, must be converted to passed arguments. Because arguments 
must have a fixed type, functions must be of fixed types when used recursively. 
Recursion is treated as a syntactic sugar for functions which explicitly pass other 
functions around [J]. Program examples 4-1 and 4-2 illustrate a simple case of the 
desugaring process. 

Program Example 4-1:

% An example of recursion
function fact(i:INT) RETURNS (INT)
    IF i<=1 THEN 1
    ELSE i*fact(i-1)
    ENDIF
ENDFUN

Program example 4-2 shows program example 4-1 "desugared". The approach
taken is to translate fact into a routine which calls dofact, which does the actual
computation.

Program Example 4-2:

function fact(i:int) RETURNS (INT)
    facttype = FUNCTYPE(INT,FACTTYPE) RETURNS (INT)
    function dofact(i:int, f:facttype) RETURNS (INT)
        if i<=1 then 1
        else i*f(i-1, f)
        endif
    endfun % dofact
    dofact(i, dofact)
endfun % fact
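
The same desugaring can be tried in Python. This is only an analogue of the VimVal program, not VimVal itself: the inner function never refers to itself by name and recurses solely through the function value it is handed.

```python
def fact(i):
    def dofact(i, f):
        # f is the explicitly passed function; dofact does not name itself here.
        return 1 if i <= 1 else i * f(i - 1, f)

    return dofact(i, dofact)


print(fact(5))  # 120
```

Because `f` is an ordinary argument, it has a single type within `dofact`, which mirrors the requirement that a function has a fixed type when it is used recursively.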

There are more complex cases of mutually recursive functions. They are dealt with 
in the general case by translating 

α : FUNCTION(<args>) (<rets>) IS
    expression-with-these-subexpressions:
END α

β : FUNCTION(...) END β

γ : FUNCTION(...) END γ

where β and γ call α (directly or indirectly) into

α : FUNCTION(<args>) (<rets>) IS
    do-α : FUNCTION(<args>,a,b,c) (<rets>) IS
        expression-with-these-subexpressions:
            b()(...,a,b,c)
            c()(...,a,b,c)
    END do-α

    do-β : FUNCTION(...,a,b,c) END do-β

    do-γ : FUNCTION(...,a,b,c) END do-γ

    do-α(<args>, do-α, do-β, do-γ)
end α

Of course this only translates α. A similar translation would need to be made for β,
so that β could be called directly. The following are some design considerations that
we took into account when we made this decision:

- We wanted VimVal to have a decidable type system, and found that,
theoretically, if we do not "fix" the type of recursive calls, the type 
becomes undecidable [8]. 

- We wanted an easy to understand type system. Aesthetically, an unfixed 
type becomes very confusing on even rather simple examples of 
recursion (see program example 4-3). 

- Practically, very few programs need the extra expressive power of 
unfixed types on recursion [16]. 

Program Example 4-3:

function F(A,B)
    F(A,B)
    F(B,A)
ENDFUN % F
F(1,1.0) % difficult to type

function F1(A,B,F2,F3)
    F2(A,B,F2,F3)
    F3(B,A,F3,F2)
ENDFUN % F1
F1(1,1.0,F1,F1) % Much easier to type


It is very difficult to give F a type in this example, because it is acceptable to pass F
anything as arguments, but the arguments are switched halfway, resulting in a
confusing type. If we write F1 instead, we can get the same meaning, but the
program is much easier to type:

F1aTYPE = FUNCTYPE(INT,REAL,F1aTYPE,F1bTYPE) RETURNS(...)
F1bTYPE = FUNCTYPE(REAL,INT,F1bTYPE,F1bTYPE) RETURNS(...)
The type of F1 when called at the top level is F1aTYPE.
The type of the third argument is F1aTYPE.
The type of the last argument is F1bTYPE.

An example of the power of this kind of recursion is given in program example 4-4,
which shows how a standard LISP function is easily written recursively in VimVal.
(We also omit the type declarations to demonstrate the ease of use of type
inference.)

Program Example 4-4:

function LENGTH(l)
    tagcase l
    tag NullVal:
    tag ConsVal:
        1+length(l.cdr)
    endtag
endfun % length



4.3 "Constant" copying 

After dealing with recursion, the remaining free variables in each module are treated 
as invocations of a generator (either of a type, or a value), which does away with 
polymorphism (since after being copied, every node must be assigned exactly one 
type). 

4.4 The Restrictions for VimVal's operators 

The actual restrictions for the operators of VimVal are presented in appendix A. 
VimVal does not need the full expressive power of operators: we have described 
VimVal using: 

Simplified closures 

Closures are specified by a set of symbols $\mathcal{X}$ and a path $\sigma$.
VimVal operators are simple enough that $\sigma$ can always be
written as a path of length zero or one. If the path is of length
zero, then the closure gives a complete list of all the node-names.
Our implementation assumes that the node-names mentioned in
the operators are all the node-names in the program, which is
slightly easier to use than if the implementation required that an
explicit list of all the node-names be presented to the type
checker. If the path is of length one, then $\sigma$ must be of the form
<n> where n is a node-name.






Simplified containers 

Containers are specified by a path $\sigma$. VimVal's operators can be
written in such a way that all the containers are specified by paths
of length two: the first element is a node-name, and the second is
a terminator (which is a selector).

Simplified parameterized restrictions 

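Either we have $A_{\langle n \rangle \equiv \langle m \rangle}$ or $A_{\langle n, \sigma \rangle \equiv \langle m \rangle}$, where m and n are
node-names, and $\sigma$ is a selector. (The general form of operators
allows parameterized restrictions of the form $A_{\alpha \equiv \beta}$, where $\alpha$ and
$\beta$ are arbitrary elements of $\Sigma^*$.)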

These restrictions allow a great improvement in the implementation of type 
checking in VimVal. 



4.5 An Efficient Algorithm for Type Checking in VIMVAL 

Our technique is to maintain an equivalence relation over node-names, which
reflects which nodes are of the same type, together with information about the
closure for each node and information about the transitions that any FSA which
represents some member of our complete-restriction must follow. Hence, in most
cases, we are able to rapidly reduce the upper bound on the number of states needed
by the FSA's which accept members of our complete-restriction, by considering each
equivalence class in the equivalence relation to represent one state of the FSA. The
system requires at least one node-name in every equivalence class to have a closure
restriction (because otherwise it might be possible to have extra transitions leading
from any state, destroying the uniqueness of the type assignment).
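
One common way to maintain such an equivalence relation is a union-find structure. The sketch below is an assumption about representation, not a description of the CLU implementation; the class name, method names, and node names are hypothetical. Each class stands for one candidate FSA state, so merging classes lowers the number of states that must be considered.

```python
class NodeClasses:
    """Equivalence classes of node-names, kept as a union-find forest."""

    def __init__(self):
        self.parent = {}

    def find(self, n):
        """Return the representative of n's class, registering n if new."""
        self.parent.setdefault(n, n)
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]  # path halving
            n = self.parent[n]
        return n

    def union(self, a, b):
        """Record that a and b have the same type."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra
        return ra

    def state_count(self):
        """Number of classes, i.e. candidate FSA states."""
        return len({self.find(n) for n in list(self.parent)})


classes = NodeClasses()
for name in ["x", "y", "z", "w"]:
    classes.find(name)
classes.union("x", "y")
classes.union("z", "w")
print(classes.state_count())
```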






Definition 4-5: A Meta Finite State Automaton (MFSA) is a tuple
$(K, \bot, \mathcal{X}, s, F, \Sigma, \delta, C)$, where

- $K$ is a set of states,

- $\bot$ is an accepting state ($\bot \in K$),

- $\mathcal{X}$ is an equivalence relation over $K$ (if $k \in K$ then $\mathcal{X}(k)$ is the class of
k under $\mathcal{X}$),

- $s$ is a start state ($s \in K$),

- $F$ is a set of final states ($F$ is the union of some of the classes of $\mathcal{X}$,
which implies $F \subseteq K$; $\bot \in F$),

- $\Sigma$ is a set of symbols,

- $\delta$ is a function mapping $(\mathcal{X} \times \Sigma) \rightarrow (\mathcal{X} \cup \{\emptyset\})$,

- and $C$ is a function mapping $\mathcal{X} \rightarrow \mathcal{P}(\Sigma)$.

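Definition 4-6: A configuration of a MFSA is a pair $(k, \sigma)$ where $k \in K$
and $\sigma \in \Sigma^*$.

Definition 4-7: A binary relation $\vdash_M$ holds between configurations of M,
a MFSA. $(k, \sigma) \vdash_M (k', \sigma') \Leftrightarrow \sigma' = \mathit{rest}(\sigma)$ and $\delta(\mathcal{X}(k), \sigma_1) = \mathcal{X}(k')$, where $\sigma_1$ is
the first symbol of $\sigma$.

The reflexive transitive closure of $\vdash_M$ is denoted as $\vdash^*_M$.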
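
The step relation can be sketched as a loop that consumes one symbol at a time. The sketch assumes the equivalence relation is given as a map from states to class names and the transition function as a dictionary keyed by (class, symbol); the state names and the example automaton are hypothetical.

```python
def run(delta, class_of, k, sigma):
    """Follow the path sigma from state k.

    Returns the class reached once every symbol is consumed, or None when
    some transition is missing (the empty-set case of the transition function).
    """
    current = class_of[k]
    for symbol in sigma:
        current = delta.get((current, symbol))
        if current is None:
            return None
    return current


# n2 is an ARRAY of n1, and n1 is an INT; "bot" plays the accepting state.
class_of = {"n1": "n1", "n2": "n2", "bot": "bot"}
delta = {("n2", "ARRAY"): "n1", ("n1", "INT"): "bot"}

print(run(delta, class_of, "n2", ("ARRAY", "INT")))  # bot
print(run(delta, class_of, "n1", ("REAL",)))         # None
```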

So far, MFSA are very similar to FSA. Now we are going to define some interesting 
operations which allow us to perform our type checking algorithm. First we are 
interested in restricting the set of FSA's that our MFSA represents to those which 
correspond to one of the cases of a simplified parameterized restriction. (See section 
4.4.) 

Definition 4-8: If M is an MFSA, and i and j are states, then

equate(M, i, j) $=_{\mathrm{def}} (K, \bot, \mathcal{X}', s, F', \Sigma, \delta', C')$,

where $b \in \mathcal{X}'(a)$ (i.e. a and b are in the same class under $\mathcal{X}'$) $\Leftrightarrow$ there is
some finite path $\sigma$ such that $(i, \sigma) \vdash^*_M (a, \langle\rangle)$ and $(j, \sigma) \vdash^*_M (b, \langle\rangle)$,

and $F'$ is the union of all the elements of $\mathcal{X}'$ which have some element in
$F$,

and $\delta'(\mathcal{X}'(a), \sigma) = \mathcal{X}'(b) \Leftrightarrow \exists (x, y) \in \mathcal{X}'(a) \times \mathcal{X}'(b)$ such that
$\delta(\mathcal{X}(x), \sigma) = \mathcal{X}(y)$,

and $C'(y) = \bigcap_{z \in \mathcal{X}'(y)} C(\mathcal{X}(z))$.

The equate operation on MFSA gives us the set of FSA's in which a given pair of 
states are always state equivalent. 

Next we are interested in the case of a container. 

Definition 4-9: If M is an MFSA, $k \in K$, and $\sigma \in \Sigma$, then

has-path(M, k, $\sigma$) $=_{\mathrm{def}}$ M',

where if there is some $x \in \delta(\mathcal{X}(k), \sigma)$ then M' = equate(M, x, $\bot$),
otherwise M' = M, except for the transition function $\delta'$, which is the
same as $\delta$, except that $\delta'(\mathcal{X}(k), \sigma) = \mathcal{X}(\bot)$.

The next definition allows us to deal with the second case of a simplified 

parameterized restriction. (See section 4.4.) 

Definition 4-10: If M is an MFSA, $i, j \in K$, and $\sigma \in \Sigma$, then

has-subpath-to(M, i, $\sigma$, j) $=_{\mathrm{def}}$ M',

where if there is some $x \in \delta(\mathcal{X}(i), \sigma)$ then M' = equate(M, x, j), otherwise
M' = M except for the function $\delta'$, which is the same as $\delta$, except that
$\delta'(\mathcal{X}(i), \sigma) = \mathcal{X}(j)$.
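
The three operations can be sketched together. This is one possible reading of the definitions, not the thesis implementation: classes are kept in a union-find forest, the operations update the automaton in place rather than returning a new one, and equating two classes also equates the targets of any transitions they share. The class name `MFSA`, the `close` helper, and the `BOTTOM` marker for the accepting state are all hypothetical.

```python
class MFSA:
    BOTTOM = "_|_"  # stands for the distinguished accepting state

    def __init__(self, states, alphabet):
        self.alphabet = frozenset(alphabet)
        self.parent = {k: k for k in list(states) + [self.BOTTOM]}
        self.delta = {}  # (class representative, symbol) -> target state
        self.closure = {k: self.alphabet for k in self.parent}

    def find(self, k):
        while self.parent[k] != k:
            k = self.parent[k]
        return k

    def same_class(self, a, b):
        return self.find(a) == self.find(b)

    def close(self, k, symbols):
        """Restrict the selectors allowed on k's class."""
        r = self.find(k)
        self.closure[r] = self.closure[r] & frozenset(symbols)

    def equate(self, i, j):
        """Merge the classes of i and j, and of everything reached alike."""
        pending = [(i, j)]
        while pending:
            a, b = pending.pop()
            ra, rb = self.find(a), self.find(b)
            if ra == rb:
                continue
            self.parent[rb] = ra
            self.closure[ra] = self.closure[ra] & self.closure.pop(rb)
            for (k, sym), target in list(self.delta.items()):
                if k != rb:
                    continue
                del self.delta[(rb, sym)]
                if (ra, sym) in self.delta:
                    # Both classes had this transition: targets must agree.
                    pending.append((self.delta[(ra, sym)], target))
                else:
                    self.delta[(ra, sym)] = target

    def has_subpath_to(self, i, sym, j):
        key = (self.find(i), sym)
        if key in self.delta:
            self.equate(self.delta[key], j)
        else:
            self.delta[key] = j

    def has_path(self, k, sym):
        self.has_subpath_to(k, sym, self.BOTTOM)


m = MFSA([1, 2, 3], {"INT", "REAL", "LIST"})
m.has_subpath_to(2, "LIST", 1)  # node 2 is a LIST of node 1's type
m.has_path(1, "INT")            # node 1 is an INT
m.has_subpath_to(2, "LIST", 3)  # node 2 is also a LIST of node 3's type
print(m.same_class(1, 3))
```

After the third operation nodes 1 and 3 share a class, so node 3 inherits the INT transition that was recorded for node 1.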

Note that a MFSA describes a set of type assignments if the following conditions
hold:

1. For any node-name n, $\{\, \sigma \mid \delta(\mathcal{X}(n), \sigma) \neq \emptyset \,\}$ is a subset of some selector
class, and is also a subset of $C(\mathcal{X}(n))$.

2. If $n \in K$, $w \in F$, $\sigma \in \Sigma$, and $\delta(\mathcal{X}(n), \sigma) = \mathcal{X}(w)$ then $\sigma$ is a terminator.

3. For every $\lambda \in \Sigma$, $\delta(\mathcal{X}(\bot), \lambda) = \emptyset$.

Note that a MFSA describes a single type assignment if the following condition
holds:

1. For every node-name n, $\{\, \sigma \mid \delta(\mathcal{X}(n), \sigma) \neq \emptyset \,\} = C(\mathcal{X}(n)) \neq \emptyset$.
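
The single-assignment condition can be checked directly. The sketch below assumes the transition function is a dictionary keyed by (class, symbol) and the closure function is a dictionary from class to set of symbols; the function name and the example data are hypothetical.

```python
def describes_single_assignment(node_classes, delta, closure):
    """Check that, for every class, the selectors with a transition are
    exactly the selectors its closure allows, and that there is at least one."""
    for c in node_classes:
        used = {sym for (k, sym) in delta if k == c}
        if not used or used != set(closure[c]):
            return False
    return True


delta = {("n1", "INT"): "bot", ("n2", "ARRAY"): "n1"}

tight = {"n1": {"INT"}, "n2": {"ARRAY"}}
loose = {"n1": {"INT", "REAL"}, "n2": {"ARRAY"}}

print(describes_single_assignment(["n1", "n2"], delta, tight))  # True
print(describes_single_assignment(["n1", "n2"], delta, loose))  # False
```

In the second case the closure of n1 still allows REAL although no REAL transition exists, so more than one type assignment remains possible.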

Computing equate(M, i, j), has-path(M, i, $\sigma$), and has-subpath-to(M, i, $\sigma$, j) takes only
on the order of quadratic time in the worst case, and usually is much better.

To compute the type assignment for a program, we perform the following: 

1. Build the MFSA with all the closures matching the closure operators in
the program. (This is easy: if a node z is closed with the set $\alpha$ in the
program, we have the function $C(z) = \alpha$. If a node z has no closures in
the program, then $C(z) = \Sigma$.)

2. Construct new MFSA's, by composing the MFSA operations which
correspond to the operators in the program. It does not matter in which
order they are composed, since the MFSA operations describe set
intersection: if A is a program operator corresponding to some MFSA
operation F, and B is a set of type assignments corresponding to some
MFSA M, then $A \cap B$ is a set of type assignments corresponding to the
MFSA F(M). Here is the correspondence between program operators
and MFSA operations:

Program Operator                                                  MFSA Operation

$(SOTA_{all})_{\langle n \rangle \equiv \langle m \rangle}$              equate(M, n, m)

$(SOTA_{all})_{\langle n, \sigma \rangle \equiv \langle m \rangle}$      has-subpath-to(M, n, $\sigma$, m)

$(SOTA_{all})_{\langle n, \sigma \rangle}$                               has-path(M, n, $\sigma$)

3. Test to see if the MFSA represents a set of type assignments (in which
case we know that the program is not type-overconstrained), and if the
MFSA represents a single type assignment (in which case we know that
the program is not type-ambiguous).

Appendix C contains the listing of a CLU [13] program to perform type checking on
VimVal.

Chapter Five
Conclusion 



Did We Meet Our Goals? 

While the VimVal compiler is not yet finished, and we have no actual experience
using VimVal, we feel confident that VimVal has much of the power and ease of use
stated in our original goals. This power is illustrated by a few examples in Appendix
B. We believe that VimVal provides a notation for polymorphic programs that is 
easy to learn and use, and we proved that VimVal is type safe, meeting the high 
level goals outlined in the introduction. The actual type rules of VimVal are fairly 
simple: 

- There must be exactly one legal type for every value in a VimVal 
program. 

- The type of a value is constrained by the operators that operate on the 
value. The VimVal manual [24], and appendix A, describe the 
constraints that each operator places on its operands. Intuitively, the 
arguments have to be used in a "consistent" way. (This is easy to state, 
but sometimes rather difficult to apply in practice, since the human 
programmer may have to actually use our algorithm to determine the 
type assignments of a program.) 

- Recursive functions are of a fixed type, but other modules are copied 
before they are compiled, which allows polymorphic functions to be 
written. 

VimVal requires a fairly complex type checking algorithm, which may require 
quite a bit of computation in the worst case. We believe that this complexity is 
acceptable in the light of VimVal's ease of use, and given that VimVal is designed 
to run on a highly parallel computer. 

Type inference allows programmers to write code which is difficult to read.
Empirically, we could argue that if type inference is difficult for a computer, it is
probably also difficult for people who are reading a program. (For example, we found it
difficult to infer "in our heads" the type of the Y-combinator (shown below) but our
type checking algorithm correctly computed the Y-combinator's type.)

Comparison with other Work 

VimVal's type system is different from Milner's [16], in that we allow "ad hoc
polymorphism" in the case of certain built in operators (such as +, which can take 
real or integer arguments). Milner discussed the possibility of adding such ad hoc 
polymorphism. 

A more important difference between our type system and Milner's is that we allow 
recursive types. The recursive types allow us to type Curry's Y combinator (which 
Milner's system can not type). 

Program Example 5-1:

function Y(f)
    function f1(x)
        f(x(x))
    endfun
    f1(f1)
endfun

which could be re-written without type inference.

Program Example 5-2:

Ytype = Functype(Ytype) returns(Ytype)
function Y(f:Ytype) returns(Ytype)
    function f1(x:Ytype) returns(Ytype)
        f(x(x))
    endfun
    f1(f1)
endfun

Except for the above differences, our concepts of type and sets of type assignments 
are not really different from Milner's. Instead of finding the "most general type" of 
an expression, and then instantiating the expression with specific types to get a 
"monotype", as Milner does, we copy the expression, and then deduce what the type
of the expression must be. These approaches are equivalent, because a "monotype"
is a member of a "most general type" if and only if there is a context in which the
expression could have the type corresponding to the "monotype".

Our approach to types can be generalized to include type abstraction [12] by
defining a correspondence between the legal operations on user defined abstract 
types and an augmented selector alphabet: abstract types are sets of objects with a 
set of operations [17], and a type checking algorithm would simply generate the 
additional selectors that the abstract type needs (which are different from the 
previously defined selectors), and put them all in the same selector class. None of 
the new selectors would be terminators. The rest of our type checking system would 
apply to this new system. We did not make this generalization because we wanted 
to limit the scope of this work, and because VimVal is perceived as a
"number-crunching" language, which does not require the powerful and easy to use
abstraction mechanisms that are found in CLU [13]. VimVal does have a type 
abstraction mechanism, which involves encapsulating a data type inside a procedure, 
but the mechanism is not easy to use (syntactic sugar would help solve this 
problem [14]), and it is impossible to maintain a representation invariant for objects 
of a given abstract type [12] (such as a requirement that an array be a sorted array). 

A View from Above 

The "high level goals" for the MIT Computation Structures Group were well stated
in [3]:

to present a system model for a kind of ideal multiprogrammed computer 
system, one that would serve many users in a way permitting sharing of 
the products of their individual programming efforts consonant with the 
principles of program modularity -- the ability to build program units 
which can be combined to form higher units, etc. 

We believe that the development of the type system for VimVal is an important 
milestone in the development of the VimVal language, which in turn represents an
important step on the path to that high level goal.

Appendix A
VimVal Operators and their Restrictions 



This appendix describes the actual operators that are in VimVal. Much of this
appendix is borrowed from Peacock's [19] appendix A.

We adopt the convention that every operator has n input nodes, named $x_1, \ldots, x_n$, and
m result nodes, named $y_1, \ldots, y_m$. An operator is a set of regular sets, and we give the set
for each operator.

$\Sigma$ =
    { REAL, INT, CHAR, BOOL, NULL, ARRAY, STREAM,
      GET-a, IS-a, ARG-n, RET-n
      | a is a legal VimVal identifier, and n is a positive integer }

The correspondence between selectors in our type system and the "type classes" in
VimVal is as follows:

selector          type class

REAL              REAL
INT               INT
CHAR              CHAR
BOOL              BOOL
ARRAY             ARRAY
STREAM            STREAM
GET-a             RECORD
IS-a              ONEOF
RET-n, ARG-n      FUNCTION




The terminators are

{ REAL, INT, CHAR, BOOL, NULL }.

The selector classes are:

{ REAL },
{ INT },
{ CHAR },
{ BOOL },
{ NULL },
{ ARRAY },
{ STREAM },
{ GET-a | a is a legal VimVal identifier },
{ IS-a | a is a legal VimVal identifier },
{ ARG-n, RET-n | n is a positive integer }.

We will call the set of all type assignments $\Theta$.

There is a little bit of added complexity due to the non-uniform polymorphism of
some of the operators in VimVal. The + operator, for example, allows arguments
which are either all integers or all reals. We can deal with such finite disjoint unions
of operators by computing a separate complete-restriction for every possibility. We
will refer to { INT, REAL, CHAR, BOOL } as RICB, { REAL, INT } as RI, and
{ REAL, CHAR } as RC.
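
Computing a separate restriction for every possibility can be sketched as a search over the alternatives of each ad hoc polymorphic operator. The sketch is deliberately simplified: a type assignment is just a set of (node, terminator) pairs, and two pairs conflict when they give one node two different terminators. The function names and the example program are hypothetical.

```python
from itertools import product

RI = ("REAL", "INT")


def plus_alternatives(x1, x2, y1):
    """One set of containers per choice of p in RI for x1 + x2 = y1."""
    return [{(x1, p), (x2, p), (y1, p)} for p in RI]


def consistent(containers):
    """No node may be given two different terminators."""
    seen = {}
    for node, terminator in containers:
        if seen.setdefault(node, terminator) != terminator:
            return False
    return True


def solve(fixed, polymorphic_ops):
    """Try every combination of alternatives and keep the consistent ones."""
    solutions = []
    for choice in product(*polymorphic_ops):
        containers = set(fixed).union(*choice)
        if consistent(containers):
            solutions.append(containers)
    return solutions


# c = a + b and e = c + d, where a is known to be an INT.
solutions = solve(
    fixed={("a", "INT")},
    polymorphic_ops=[plus_alternatives("a", "b", "c"),
                     plus_alternatives("c", "d", "e")],
)
print(len(solutions))
```

Of the four combinations only the all-integer one is consistent, so this small program would be type correct under the simplified model.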

Most operators in the VimVal language correspond to more than one operator as
defined in definition 3-7. Rather than write the operators in the form $(SOTA_{all})_{\sigma_i}$
for i in some set of integers, we will write the restrictions in standard set notation.

We will also choose not to mention the closure operator for operators which
mention selectors which are in selector classes of order one. This set of selectors is
OWNCLASS = { REAL, INT, CHAR, BOOL, NULL, ARRAY, STREAM }. In
general, if an operator specifies that there is some path $\langle z, \sigma \rangle$, with
$\sigma \in$ OWNCLASS, then there is an implied closure operator: the closure under
$\{\sigma\}$ of $SOTA_{all}$ selected by $\langle z \rangle$.

A.1 Basic Operators

A.1.1 Error Tests

There are three universal error tests in VimVal. Their names are is-undef,
is-miss-elt, and is-error. They have 1 input and 1 output. Their only constraint
is that the output must be boolean.

$\{\, S \in \Theta \mid \langle y_1, BOOL \rangle \in S \,\}$

A.1.2 Equal and Not Equal

Equal (=) and not equal (~=) are in a special class because they constrain their
argument types not to a specific type but to a set of four possible types, namely real,
integer, char, or bool. They have 2 inputs and 1 output. The inputs must be the
same type and the output is a bool. Thus there is one operator for every $p \in$ RICB.

$\forall p \in$ RICB:

$\{\, S \in \Theta \mid \{ \langle x_1, p \rangle, \langle x_2, p \rangle, \langle y_1, BOOL \rangle \} \subseteq S \,\}$

A.1.3 Boolean Operators 

There are two classes of boolean operators in VimVal. The first class has two 
arguments, the second has one. 

A.1.3.1 Two Argument Boolean Operators

The members of the class with two arguments are and (&) and or (|). Their
constraints are that all the inputs and results must be bool.

$\{\, S \in \Theta \mid \{ \langle x_1, BOOL \rangle, \langle x_2, BOOL \rangle, \langle y_1, BOOL \rangle \} \subseteq S \,\}$

A.1.3.2 One Argument Boolean Operators

The second class has only one member, the not (~) operator. The input and result
are both bool.

$\{\, S \in \Theta \mid \{ \langle x_1, BOOL \rangle, \langle y_1, BOOL \rangle \} \subseteq S \,\}$

A.1.4 Type Conversion Operations

There are three operations intended to convert one data type into another. These
are real, character, and integer. They all have one input and one result.

real         $\{\, S \in \Theta \mid \{ \langle x_1, INT \rangle, \langle y_1, REAL \rangle \} \subseteq S \,\}$

integer      $\forall p \in$ RC: $\{\, S \in \Theta \mid \{ \langle x_1, p \rangle, \langle y_1, INT \rangle \} \subseteq S \,\}$

character    $\{\, S \in \Theta \mid \{ \langle x_1, INT \rangle, \langle y_1, CHAR \rangle \} \subseteq S \,\}$

A.1.5 Real and Integer Operations

Most real and integer operations have the same names. Those that do are divided 

into four classes. There are some special cases, which are described after the four 

classes. 

A.1.5.1 Binary Operators

The first class takes two arguments and returns one result, all three types being the
same type, and being real or integers. The members of this class are plus (+),
minus (-), multiply (*), divide (/), max, and min.

$\forall p \in$ RI:

$\{\, S \in \Theta \mid \{ \langle x_1, p \rangle, \langle x_2, p \rangle, \langle y_1, p \rangle \} \subseteq S \,\}$

A.1.5.2 Unary Operators

The next class has one argument and one result, both of the same type, and both
either integer or real. The members of this class are negation (-) and abs.

$\forall p \in$ RI:

$\{\, S \in \Theta \mid \{ \langle x_1, p \rangle, \langle y_1, p \rangle \} \subseteq S \,\}$

A.1.5.3 Relational Operators

The next class has two arguments and one result. The arguments must be the same
type, and be integer or real. The result is a boolean. The members of this class are
>, <, >=, and <=.

$\forall p \in$ RI:

$\{\, S \in \Theta \mid \{ \langle x_1, p \rangle, \langle x_2, p \rangle, \langle y_1, BOOL \rangle \} \subseteq S \,\}$
A.1.5.4 Exception Predicates

The fourth and final class of real/integer operations has one argument and one result. The argument can be real or integer, and the result is a boolean. The members of this class are is-pos-over, is-neg-over, is-unknown, is-zero-divide, is-over, and is-arith-error.

\[ \forall \rho \in RI:\quad \{ S \in O \mid \{ \langle x_1, \rho \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \} \]

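A.1.5.5 Special Cases

There are five operations that operate on real and integer types which do not fit into the above classes. The first of these special cases is mod, with two arguments and one result, all of which are integer.

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle, \langle y_1, \mathrm{INT} \rangle \} \subseteq S \} \]

The second special case is exp (which computes $x_1^{x_2}$), with two inputs and one result. If $x_2$ is REAL then all are real, and if $x_1$ is INT then all are integers.

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{REAL} \rangle, \langle x_2, \mathrm{REAL} \rangle, \langle y_1, \mathrm{REAL} \rangle \} \subseteq S \} \]

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle, \langle y_1, \mathrm{INT} \rangle \} \subseteq S \} \]

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{REAL} \rangle, \langle x_2, \mathrm{INT} \rangle, \langle y_1, \mathrm{REAL} \rangle \} \subseteq S \} \]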
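The three alternatives for exp can be tabulated directly. The following Python sketch is only an illustration of that reading; the table name EXP_ALTERNATIVES and the function exp_result_type are hypothetical and do not come from the thesis or its CLU listing.

```python
# Allowed (x1, x2) input types for exp and the resulting y1 type,
# one entry per constraint set listed for exp.
EXP_ALTERNATIVES = {
    ("REAL", "REAL"): "REAL",
    ("INT", "INT"): "INT",
    ("REAL", "INT"): "REAL",
}


def exp_result_type(x1_type, x2_type):
    """Return the result type of exp, or None if no alternative matches."""
    return EXP_ALTERNATIVES.get((x1_type, x2_type))
```

An integer base with a real exponent matches none of the three sets, so the lookup returns None for that combination.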

The final three special cases are is-pos-under, is-neg-under, and is-under, with one input (a real) and one output (a boolean).

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{REAL} \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \} \]

A.1.6 The empty operation

The empty operation has no inputs, and one result: a stream or an array. There is a "dummy" node called z which is used for technical reasons:

\[ \{ S \in O \mid \langle y_1, \mathrm{ARRAY} \rangle =_s \langle z \rangle \} \]

\[ \{ S \in O \mid \langle y_1, \mathrm{STREAM} \rangle =_s \langle z \rangle \} \]

A.1.7 Array Operators

A.1.7.1 Array-fill

The array-fill operator has three inputs and one output. The first two inputs are integers, and the output is an array of the type of $x_3$.

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle \} \subseteq S, \text{ and } \langle y_1, \mathrm{ARRAY} \rangle =_s \langle x_3 \rangle \} \]

A.1.7.2 Select

The select operator ([]) has two inputs, an array and an integer, and an output, an element of the array.

\[ \{ S \in O \mid \{ \langle x_2, \mathrm{INT} \rangle \} \subseteq S, \text{ and } \langle x_1, \mathrm{ARRAY} \rangle =_s \langle y_1 \rangle \} \]

A.1.7.3 Append

The append operation takes three inputs and gives one result. The first input, the last input, and the output are all arrays of the same type. The second input is an integer.

\[ \{ S \in O \mid \{ \langle x_2, \mathrm{INT} \rangle \} \subseteq S, \text{ and } \langle x_1, \mathrm{ARRAY} \rangle =_s \langle x_3, \mathrm{ARRAY} \rangle =_s \langle y_1, \mathrm{ARRAY} \rangle \} \]

A.1.7.4 Create-by-elements

The create-by-elements operator [:] takes n>1 inputs and gives one result. The first input is an integer, and the output is an array of the second input's type. The rest of the inputs must be the same type as the second input.

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{INT} \rangle \} \subseteq S, \text{ and } \langle x_i \rangle =_s \langle y_1, \mathrm{ARRAY} \rangle \text{ for } i \in \{2, \ldots, n\} \} \]

A.1.7.5 Array To Integer Operators

The following three operators have the same constraints: array-limh, array-liml, and array-size. They take an array input and give an integer result. We need a dummy node named z.

\[ \{ S \in O \mid \{ \langle y_1, \mathrm{INT} \rangle \} \subseteq S, \text{ and } \langle x_1, \mathrm{ARRAY} \rangle =_s \langle z \rangle \} \]

A.1.7.6 Array-adjust

The array-adjust operator takes three inputs and gives an output. The first two inputs are integers. The last input and the output are arrays of the same type.

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle \} \subseteq S, \text{ and } \langle x_3, \mathrm{ARRAY} \rangle =_s \langle y_1, \mathrm{ARRAY} \rangle \} \]

A.1.7.7 Array-addh and Array-addl

The operations array-addh and array-addl both take two inputs and yield an output. The first input and the output are arrays of the second input's type.

\[ \{ S \in O \mid \langle x_1, \mathrm{ARRAY} \rangle =_s \langle x_2 \rangle =_s \langle y_1, \mathrm{ARRAY} \rangle \} \]

A.1.7.8 Array-remh and Array-reml

The operations array-remh and array-reml both take one input and give one output. The input is an array of the output's type.

\[ \{ S \in O \mid \langle x_1, \mathrm{ARRAY} \rangle =_s \langle y_1 \rangle \} \]

A.1.7.9 Array-setl and Array-seth

The operations array-setl and array-seth take an array and an integer and give an array output. The first input and the output are arrays of the same type.

\[ \{ S \in O \mid \{ \langle x_2, \mathrm{INT} \rangle \} \subseteq S, \text{ and } \langle x_1, \mathrm{ARRAY} \rangle =_s \langle y_1, \mathrm{ARRAY} \rangle \} \]

A.1.7.10 Concatenate and Join

The operations concatenate and array-join take two arrays and give one array, all of the same type.

\[ \{ S \in O \mid \langle x_1, \mathrm{ARRAY} \rangle =_s \langle x_2, \mathrm{ARRAY} \rangle =_s \langle y_1, \mathrm{ARRAY} \rangle \} \]
A.1.8 Stream Operations

A.1.8.1 Stream Creation

The stream operator allows n inputs and one output. There is really one operator for every non-negative number n. (We will assume that there is at least one input. If not, we need a dummy input, which we can call $x_1$.) The inputs must all be the same type, and the output is a stream of that type.

\[ \{ S \in O \mid \langle x_i \rangle =_s \langle y_1, \mathrm{STREAM} \rangle \text{ for } i \in \{1, \ldots, n\} \} \]

A.1.8.2 Stream Null

The null operator takes a stream and returns a boolean. We need a dummy node named z.

\[ \{ S \in O \mid \langle y_1, \mathrm{BOOL} \rangle \in S, \text{ and } \langle x_1, \mathrm{STREAM} \rangle =_s \langle z \rangle \} \]

A.1.8.3 Stream First

The first operator takes a stream[T] and returns a T.

\[ \{ S \in O \mid \langle x_1, \mathrm{STREAM} \rangle =_s \langle y_1 \rangle \} \]

A.1.8.4 Stream Rest

The rest operator takes a stream and returns a stream of the same type. We need a dummy node z to describe this restriction.

\[ \{ S \in O \mid \langle x_1 \rangle =_s \langle y_1 \rangle, \text{ and } \langle x_1, \mathrm{STREAM} \rangle =_s \langle z \rangle \} \]

A.1.8.5 Stream affix

The affix operator takes a stream[T] and a T, and returns a stream[T].

\[ \{ S \in O \mid \langle x_1, \mathrm{STREAM} \rangle =_s \langle x_2 \rangle =_s \langle y_1, \mathrm{STREAM} \rangle \} \]
A.1.9 Record Operators

A.1.9.1 The Record Constructor

The record operator takes n inputs and gives one output. Note that there is one record operator for every finite set of VimVal identifiers. Assume that $\alpha_1, \ldots, \alpha_n$ are sorted lexicographically. We must be sure to exclude other selectors on the output.

\[ \{ S \in O \mid \langle y_1, \mathrm{GET}\text{-}\alpha_i \rangle =_s \langle x_i \rangle \text{ for } i \in \{1, \ldots, n\}, \text{ and } \langle y_1, \mathrm{GET}\text{-}\beta, \ldots \rangle \notin S \text{ if } \beta \notin \{\alpha_1, \ldots, \alpha_n\} \} \]

A.1.9.2 Record Selection

The select operation on records takes a record and gives a value which was stored in the record. Note that we must be careful to allow paths that start with GET-$\beta$, for all $\beta \neq \alpha$, because the select path does not say anything about the other selectors.

\[ \{ S \in O \mid \langle x_1, \mathrm{GET}\text{-}\alpha \rangle =_s \langle y_1 \rangle \} \]

A.1.9.3 Record Replace

The replace operation on records takes a record and a value, and returns a new record of the same type.

\[ \{ S \in O \mid \langle x_1 \rangle =_s \langle y_1 \rangle, \text{ and } \langle x_1, \mathrm{GET}\text{-}\alpha \rangle =_s \langle x_2 \rangle \} \]

A.1.10 Union Types

A.1.10.1 Union Make

The make operator takes an object and returns a oneof.

\[ \{ S \in O \mid \langle x_1 \rangle =_s \langle y_1, \mathrm{IS}\text{-}\alpha \rangle \} \]

A.1.10.2 Union Is

The is operator takes a oneof and returns a boolean. We need a dummy node named z.

\[ \{ S \in O \mid \langle x_1, \mathrm{IS}\text{-}\alpha \rangle =_s \langle z \rangle, \text{ and } \langle y_1, \mathrm{BOOL} \rangle \in S \} \]

A.1.11 Constants

Integer, real, and character constants have no inputs and one output. The output must be the type of the constant.

Real:
\[ \{ S \in O \mid \langle y_1, \mathrm{REAL} \rangle \in S \} \]

Integer:
\[ \{ S \in O \mid \langle y_1, \mathrm{INT} \rangle \in S \} \]

Character:
\[ \{ S \in O \mid \langle y_1, \mathrm{CHAR} \rangle \in S \} \]

A.2 Type Declarations 

Variables and Formal arguments may have type information explicitly given about 

them through a type specification. The type specification is treated just like an 

expression for the purposes of typing. 

A.2.1 Basic Type Specifications

Reals, integers, characters, booleans, and null can each be specified by their names, which have selectors associated with them: REAL, INT, CHAR, BOOL, and NULL respectively. If we see a basic type $\alpha$, with selector B, then there is only one "output", and that is the type $\alpha$.

\[ \{ S \in O \mid \langle y_1, B \rangle \in S \} \]

A.2.2 Array and Stream Type specifications

If we see a type specification ARRAY[A], where A is a type specification, then we say the "output" is an array of A, and the "input" is A.

\[ \{ S \in O \mid \langle y_1, \mathrm{ARRAY} \rangle =_s \langle x_1 \rangle \} \]

Similarly for streams:

\[ \{ S \in O \mid \langle y_1, \mathrm{STREAM} \rangle =_s \langle x_1 \rangle \} \]
A.2.3 Record and Oneof Type specifications

If we see a record type specification RECORD[$\alpha_1$:$A_1$, ..., $\alpha_n$:$A_n$], then we treat it exactly the same as the record constructor in section A.1.9.1.

Similarly for oneof type specifications: there is no oneof constructor that specifies all the arms, but it should be treated like a record constructor, replacing all the GET-$\alpha$'s with IS-$\alpha$'s:

\[ \{ S \in O \mid \langle y_1, \mathrm{IS}\text{-}\alpha_i \rangle =_s \langle x_i \rangle \text{ for } i \in \{1, \ldots, n\}, \text{ and if } \langle y_1, \mathrm{IS}\text{-}\beta, \ldots \rangle \in S, \text{ then } \beta = \alpha_j \text{ for some } j \} \]

A.2.4 Function Type Specifications 

Function type specifications are treated just like function applications in section 
A.4.1. Instead of having subexpressions, we have subtypes. 

A.2.5 Free Variables as Type Specifications 

A free variable just names a single node, as is true for any VimVal expression. 

A.3 Basic Constructs 

A.3.1 If then else

The if then else operator appears in the form:

IF <exp1> THEN <exp2> ELSE <exp3> ENDIF

We require that <exp1> be a boolean 1-valued expression, and that <exp2> and <exp3> be m-valued expressions, where the ith value of <exp2> is the same type as the ith value of <exp3> for i=1 through m. The IF is an m-valued expression.

We label the <exp1> node $x_{1,1}$, the <exp2> nodes $x_{2,1}, \ldots, x_{2,m}$, the <exp3> nodes $x_{3,1}, \ldots, x_{3,m}$, and the result nodes $y_1, \ldots, y_m$.

\[ \{ S \in O \mid \langle x_{1,1}, \mathrm{BOOL} \rangle \in S, \text{ and } \langle x_{2,i} \rangle =_s \langle x_{3,i} \rangle =_s \langle y_i \rangle \text{ for } i \in \{1, \ldots, m\} \} \]
A.3.2 Tagcase

The tagcase construct appears in the form:

TAGCASE <exp>
    TAG α_1 (n_1): <exp_1>
    TAG α_2 (n_2): <exp_2>
    ...
    TAG α_n (n_n): <exp_n>
    { OTHERWISE : <exp_n+1> }
ENDTAG

The requirements are that <exp_1>, ..., <exp_n+1> are the same type, and T(<exp>) must be a oneof type with $\alpha_1, \ldots, \alpha_n$ as tag values. (If the OTHERWISE is not included, then there must be no other tag values.) The value of a TAGCASE can be an m-valued expression.

We label the node of <exp> as $x_0$, the nodes of <exp_j> as $x_j$ for j=1,...,m. The resulting nodes of the tagcase are $y_j$ for j=1,...,m.

If the OTHERWISE is included we have:

\[ \{ S \in O \mid \langle exp_i \rangle =_s \langle y_1 \rangle \text{ for } i \in \{1, \ldots, n+1\}, \text{ and } \langle exp, \mathrm{GET}\text{-}\alpha_i \rangle =_s \langle n_i \rangle \} \]

If the OTHERWISE clause is not included, add the extra restriction that there are no other tags:

\[ \{ S \in O \mid \langle exp_i \rangle =_s \langle y_1 \rangle \text{ for } i \in \{1, \ldots, n+1\}, \text{ and } \langle exp, \mathrm{GET}\text{-}\alpha_i \rangle =_s \langle n_i \rangle, \text{ and if } \langle exp, \mathrm{GET}\text{-}\beta, \ldots \rangle \in S, \text{ then } \beta = \alpha_i \text{ for some } i \} \]
A.3.3 Forall construct

The forall construct appears as:

FORALL <var> IN [ <exp1> , <exp2> ]
    CONSTRUCT or EVAL <exp3>
ENDALL

There are two cases, the CONSTRUCT and the EVAL case. In every case, <exp1> and <exp2> must be integer. We label <exp1>'s node $x_1$, <exp2>'s node $x_2$, and <exp3>'s node $x_3$. The result node is $y_1$.

A.3.3.1 Forall with CONSTRUCT

The restriction for the CONSTRUCT case is that if <exp3> is of type T, then $y_1$ is of type ARRAY[T].

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle \} \subseteq S, \text{ and } \langle x_3 \rangle =_s \langle y_1, \mathrm{ARRAY} \rangle \} \]

A.3.3.2 Forall with EVAL

There are six possible "evaluation operators" for the EVAL clause of a forall statement. In each case we have the additional restriction that the type of <exp3> must be the same as the type of the output.

There are more restrictions, based on which evaluation operator is used.

In the case of +, *, min, or max we have the restriction that <exp3> must be an integer or a real:

\[ \forall \rho \in RI:\quad \{ S \in O \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle, \langle x_3, \rho \rangle, \langle y_1, \rho \rangle \} \subseteq S \} \]

In the case of & (and) or | (or) we have the restriction that <exp3> must be boolean.

\[ \{ S \in O \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle, \langle x_3, \mathrm{BOOL} \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \} \]
A.4 Functions

There are two ways that a function is encountered in VimVal. The first is the declaration of a function, which is first treated by the compiler to get rid of polymorphism and recursion. The second is when the function is passed as an argument (either to a built-in operator such as apply, or to another function).

A.4.1 Function Declaration

After a function has been copied and modified to deal with polymorphism and recursion, the type checker sees a "function declaration" node, which we can write as

FUNCTION(α_1, ..., α_n) RETURNS (β_1, ..., β_m) <EXPRESSION> END FUNCTION

where the $\alpha_i$'s and $\beta_j$'s actually are node names of nodes inside <EXPRESSION>. We assume that <EXPRESSION> is m-valued, and that $\beta_j$ is the name of the jth output node of <EXPRESSION>. The resulting type constraint of a function declaration is that the output is a function taking n values, such that the ith value is of type $\alpha_i$, and returning m values, such that the jth returned value is of type $\beta_j$. $y_1$ refers to the node of the actual function.

\[ \{ S \in O \mid \langle y_1, \mathrm{ARG}\text{-}i \rangle =_s \langle \alpha_i \rangle, \text{ and } \langle y_1, \mathrm{RET}\text{-}j \rangle =_s \langle \beta_j \rangle \text{ for appropriate } i \text{ and } j \} \]

A.4.2 Function Application

If we see

<exp>(<exp_1>, ..., <exp_n>)

then we have a function application. The requirements are that <exp_i> be the same type as the ith argument of <exp>, and that the jth output of this function application is the type of the jth return value of <exp>. Here <exp> is labeled $x_1$, and <exp_i> is labeled $x_{i+1}$. The outputs are labeled $y_j$ for appropriate values of j.

\[ \{ S \in O \mid \langle x_1, \mathrm{ARG}\text{-}i \rangle =_s \langle x_{i+1} \rangle, \text{ and } \langle x_1, \mathrm{RET}\text{-}j \rangle =_s \langle y_j \rangle \} \]
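To make the ARG-i and RET-j selectors concrete, here is a minimal Python sketch of the application rule. It is not the thesis's algorithm: types are modelled, as an assumption, by nested dictionaries from selector to subtype with strings for basic types, state equivalence is approximated by structural equality, and the helper names follow and application_ok are hypothetical.

```python
def follow(type_, *selectors):
    """Walk a path of selectors through a nested type; None if the path is absent."""
    for selector in selectors:
        if not isinstance(type_, dict) or selector not in type_:
            return None
        type_ = type_[selector]
    return type_


def application_ok(fn_type, arg_types, result_types):
    """Check <x1, ARG-i> against each argument and <x1, RET-j> against each result."""
    args_ok = all(
        follow(fn_type, f"ARG-{i}") == arg
        for i, arg in enumerate(arg_types, start=1)
    )
    rets_ok = all(
        follow(fn_type, f"RET-{j}") == ret
        for j, ret in enumerate(result_types, start=1)
    )
    return args_ok and rets_ok


# A function type taking one INT and returning one BOOL.
int_to_bool = {"ARG-1": "INT", "RET-1": "BOOL"}
```

The sketch checks only the selectors that the application mentions, which mirrors the constraint above. A fuller checker would also compare recursive types by state equivalence rather than by equality.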
Appendix B
Examples of the power of VimVal

Program example 5-3 composes two functions to give a new one. One weakness in our type system is that one cannot write a function which takes an arbitrary number of arguments. (This weakness is a result of the syntax of VimVal, rather than the type system itself.)
Program Example 5-3:

function compose (F:functype(B) returns (C),
                  G:functype(A) returns (B))
         returns (functype(A) returns (C))
    function composer (aval:A) returns (C)
        F(G(aval))
    endfun % composer
    composer % return the composer
endfun % compose

Program example 5-4 implements the same function, with type inference instead.
Program Example 5-4:

function compose (F,G)
    function composer (aval)
        F(G(aval))
    endfun % composer
    composer
endfun % compose
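For readers who do not know VimVal, the same idea can be written in Python. This is an analogy only: Python checks nothing statically, so the sketch shows the higher order structure of compose, not the type inference. The lambdas are made-up sample arguments.

```python
def compose(f, g):
    """Return a function that applies g first and then f, as in example 5-4."""
    def composer(aval):
        return f(g(aval))
    return composer  # the inner function is returned as a first-class value


# Hypothetical sample functions: g doubles its argument, f adds one.
add_one_after_double = compose(lambda b: b + 1, lambda a: a * 2)
```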

Program example 5-5 shows how a multiplier, the encapsulation of multiplication by a constant, can be implemented in VimVal:
Program Example 5-5:

% MakeMul takes an integer I and returns a
% function which multiplies integers by I
function MakeMul(i:INT) returns (FUNCTYPE(INT) returns (INT))
    function doIt(j:int) returns (int) i*j endfun
    doIt % return doIt
endfun

Program example 5-6 shows how the multiplier in example 5-5 can be written without explicit type declarations. This example is slightly more powerful, in that it can operate on reals or integers.
Program Example 5-6:

function MakeMul(i)
    function doIt(j) i*j endfun
    doIt
endfun
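A rough Python counterpart of the untyped multiplier is sketched below, as an illustration rather than a translation. Python resolves the meaning of * at run time, whereas VimVal infers whether i and j are integers or reals during type checking. The name make_mul is chosen for this example.

```python
def make_mul(i):
    """Return a function that multiplies its argument by the captured value i."""
    def do_it(j):
        return i * j
    return do_it


triple = make_mul(3)       # behaves as an integer multiplier
scale = make_mul(2.5)      # behaves as a real multiplier
```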

Program example 5-7 demonstrates a "password hider" program, which can be used to hide information that will only be released upon presentation of the correct password. See [18] for further details on this sort of protection.
Program Example 5-7:

type hider = functype(givenpass:T,
                      command:oneof[store:T; fetch])
             returns (oneof[badpass;
                            didstore:hider;
                            didfetch:T])
type pfuntype = functype(T,T) returns(boolean)

function makePassword(password:T,
                      passfun:pfuntype,
                      hiddenObject:T)
         returns (hider)
    % makePassword returns a function which knows the password and knows the
    % hidden object, but will not reveal the hidden object unless the user
    % presents the correct password. There is also no way to uncover the
    % password itself, except by subverting the type system, e.g. using
    % a debugger (or perhaps by trial and error).
    function doIt(givenpass, command)
        % doIt is the function that is returned by makePassword. doIt
        % knows the password, because the password is in doIt's lexical
        % scope.
        % doIt returns the value iff the password presented causes
        % PASSFUN(PASSWORD, GIVENPASS) to return true.
        if ~passfun(password,givenpass) then
            make[BadPass:nil]
        else
            tagcase o:=command
                tag store: make[DidStore:makePassword(password,passfun,o)]
                tag fetch: make[DidFetch:hiddenObject]
            endtag
        endif
    endfun % doIt
    doIt % return doIt
endfun % makePassword
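The closure structure of the password hider can be sketched in Python as follows. This is a hedged illustration, not a translation: tagged tuples stand in for the oneof values, nothing is type checked, and Python offers no real protection of the captured password. The names make_password and do_it mirror the VimVal example, while the tuple encoding is an assumption.

```python
def make_password(password, passfun, hidden_object):
    """Return a closure that releases hidden_object only for a matching password."""
    def do_it(given_pass, command):
        # command is ("store", value) or ("fetch",), standing in for the oneof.
        if not passfun(password, given_pass):
            return ("badpass", None)
        if command[0] == "store":
            # Storing yields a new hider guarding the new value.
            return ("didstore", make_password(password, passfun, command[1]))
        return ("didfetch", hidden_object)
    return do_it


# Hypothetical usage: equality as the password predicate.
hider = make_password("s3cret", lambda a, b: a == b, 42)
```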
Finally, we have an example which implements Lisp primitives in VimVal.
Program Example 5-8:

function cons(a,b)
    make[ConsVal:record$[car:a,cdr:b]]
endfun % cons

% The car of null is null
function car(a)
    tagcase b:=a
        tag ConsVal: b.car
        tag NullVal: a
    endtag
endfun % car

% The cdr of a null is null
function cdr(a)
    tagcase b:=a
        tag ConsVal: b.cdr
        tag NullVal: a
    endtag
endfun % cdr

function nullp(a)
    is NullVal(a)
endfun % nullp

function lispnil()
    make[NullVal:null]
endfun % lispnil

function length(a)
    if nullp(a) then 0
    else 1+length(cdr(a))
    endif
endfun % length

function append(a,b)
    if nullp(a) then b
    else cons(car(a), append(cdr(a), b))
    endif
endfun % append

% ith is zero-based: ith(a,0) is car(a)
function ith(a,i)
    if i>0 then ith(cdr(a),i-1)
    else car(a)
    endif
endfun % ith
function reverse(a)
    % doreverse returns the first i elements of a in reverse
    function doreverse(a, i)
        if i=0 then lispnil()
        % ith is zero-based, so the i-th element from the front is ith(a, i-1)
        else cons(ith(a, i-1), doreverse(a, i-1))
        endif
    endfun % doreverse
    doreverse(a, length(a))
endfun % reverse
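The list primitives are easy to exercise in Python, which helps in checking the indexing in reverse. The sketch below is an illustration under stated assumptions: tagged tuples stand in for the ConsVal and NullVal arms of the oneof, ith is taken to be zero-based (so doreverse asks for element i-1), and to_list is a hypothetical helper added only for inspecting results.

```python
NIL = ("NullVal",)


def cons(a, b):
    return ("ConsVal", a, b)


def nullp(a):
    return a[0] == "NullVal"


def car(a):
    return a[1] if a[0] == "ConsVal" else a   # the car of null is null


def cdr(a):
    return a[2] if a[0] == "ConsVal" else a   # the cdr of null is null


def length(a):
    return 0 if nullp(a) else 1 + length(cdr(a))


def ith(a, i):
    return ith(cdr(a), i - 1) if i > 0 else car(a)   # zero-based


def reverse(a):
    def doreverse(a, i):
        # returns the first i elements of a in reverse order
        return NIL if i == 0 else cons(ith(a, i - 1), doreverse(a, i - 1))
    return doreverse(a, length(a))


def to_list(a):
    """Hypothetical helper: convert a cons list to a Python list."""
    out = []
    while not nullp(a):
        out.append(car(a))
        a = cdr(a)
    return out


xs = cons(1, cons(2, cons(3, NIL)))
```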






Appendix C 
Listing of the VIM-VAL type checker 



This appendix contains a listing of the VIM-VAL type checker, which is written in the CLU [13] programming language. The style is "functional", i.e. we have been careful to avoid side-effects, so that the eventual translation of the VIM-VAL compiler into VIM-VAL will not be too painful.

SOTA       A cluster which implements the MFSA defined in definition 4-5,
           along with its operations and the predicates which can be used to
           determine type correctness.

SET        A cluster which implements the mathematical object set.

EQUIVREL   A cluster which implements equivalence relations.

MAP        A cluster which implements maps from one set of objects to
           another set of objects.

SOTATEST   A procedure which tests SOTA.
#extend

sota = cluster[alphabet, nodename, classname: type] is
        create, equate, has_subpath_to, has_path, has_closed_path,
        close,
        get_unique_type_assignment,
        export   % for debugging only
    where
        alphabet has get_class: proctype(alphabet) returns(classname),
                     equal: proctype(alphabet, alphabet) returns(bool),
                     get_is_terminator: proctype(alphabet) returns(bool),
        % requires: if two alphabet items A and B then
        %   A.class = B.class implies A.is_terminator = B.is_terminator
        nodename has equal: proctype(nodename, nodename) returns(bool),
        classname has equal: proctype(classname, classname) returns(bool)

    abstract = sota[alphabet, nodename, classname]
    rep = struct[equivs: ERNN,
                 closures: TNSA,
                 transitions: tntano]
    ERNN = EquivRel[NodeName]
    TNSA = map[NodeName, SA]
    SA = Set[Alphabet]
    tntano = map[NodeName, tano]
    tano = map[alphabet, no]
    no = oneof[acceptor: null,
               node: nodename]
    nopair = struct[first, second: no]
    % nodepair = struct[first, second: nodename]
    agenda = set[nopair]

    % representation invariant I(R)
    %   R.equivs agrees with R.transitions: i.e.
    %     EquivRel[NodeName]$equivalent(R.equivs, n, m) implies
    %       R.transitions[n] = R.transitions[m]
    %   R.transitions preserves well-typedness: i.e.
    %     R.transitions[n][a] and R.transitions[n][b] are defined implies
    %       a.class = b.class
    %   R.closures agrees with R.transitions: i.e.
    %     if R.closures[n] is defined then
    %       Domain(R.transitions[n]) is a subset of R.closures[n]
    % abstraction function: R corresponds to A iff
    %   equivrel[NodeName]$equivalent(r.equivs, n, m) iff
    %     for all Q in A, <n> is state-equivalent to <m>
    %   equivrel[nodename]$equivalent(r.equivs, m,
    %                                 no$value_node(r.transitions[n][a])) iff
    %     for all Q in A, <n,a> is state-equivalent to <m>
    %   no$is_acceptor(r.transitions[n][a]) iff
    %     for all Q in A, <n,a> is in Q

    create = proc() returns(cvt)
        % returns the set of all type assignments
        return(rep${equivs: ERNN$create(),
                    closures: TNSA$create(),
                    transitions: TNTANO$create()})
        end create

    equate = proc(os: cvt, nodei, nodej: nodename) returns(cvt) signals(empty)
        % returns OS[nodei=nodej], (signals empty if there is none)
        if ERNN$equivalent(os.equivs, nodei, nodej) then return(os) end
        ttd: agenda := agenda$[nopair${first: no$make_node(nodei),
                                       second: no$make_node(nodej)}]
        % ttd: things to do, but these things have to be checked for
        % compatibility AND put into the equivrel
        newequivs: ernn := os.equivs
        while ~agenda$is_empty(ttd) do
            nowdo: nopair
            nowdo, ttd := agenda$pick_rest(ttd)
            % the first thing to check is previous equivalence. If they
            % are already equivalent, then we don't need to add more.
            % after that, we should check for compatibility. The class of the
            % labels on the output transitions should be the same. We really
            % only need to test one of them from each node.
            % After that, we should gather a list of the ones that should be
            % made if these two nodes are equivalent. This really must be
            % done for the whole class of them
            if no$is_node(nowdo.first) cand no$is_node(nowdo.second) then
                nowdo1: nodename := no$value_node(nowdo.first)
                nowdo2: nodename := no$value_node(nowdo.second)
                if ~ernn$equivalent(newequivs, nowdo1, nowdo2) then
                    % now we actually have to equate them, but are they compatible?
                    if ~compatible(os, nowdo1, nowdo2)
                       then signal empty end
                    % we must go to the mapping and add stuff
                    ttd := ttd | pairs_which_must_be_same(os, newequivs[nowdo1],
                                                          newequivs[nowdo2])
                    newequivs := ernn$equate(newequivs, nowdo1, nowdo2)
                    end
             elseif no$is_node(nowdo.first) cor no$is_node(nowdo.second)
                then signal empty end
            end
        % built up newequivs, but not done yet
        % now we have to actually create the new object to return
        % we must extend the old maps
        % (not because newequivs does not partition everything correctly, it
        % does, but because we can only get the non_trivial_classes out, and
        % that is not everything)
        rettrans: tntano := os.transitions
        retclos: tnsa := os.closures
        for eclass: set[NodeName] in ernn$non_trivial_classes(newequivs) do
            everclosed: bool := false   % did we ever hit a closure for this class?
            thistran: tano := tano$create()
            thisclose: sa := sa$create()
            for elt: nodename in set[nodename]$elements(eclass) do
                for al: alphabet, n: no in tano$entries(os.transitions[elt]) do
                    thistran := tano$define_override(thistran, al, n)
                    end except when undefined: end
                begin
                    if everclosed then
                        thisclose := thisclose & os.closures[elt]
                    else
                        thisclose := os.closures[elt]
                        everclosed := true
                        end
                    end   % this is so we can keep track of if we closed it
                    except when undefined: end
                end
            % check for the closure restriction one last time
            if everclosed cand ~tano$domain_is_in(thistran, thisclose)
               then signal empty end   % not an error if never closed
            for elt: nodename in set[nodename]$elements(eclass) do
                rettrans := tntano$define_override(rettrans, elt, thistran)
                if everclosed then
                    retclos := tnsa$define_override(retclos, elt, thisclose)
                    end   % don't define unless we actually closed it
                end
            end
        return(rep${equivs: newequivs,
                    closures: retclos,
                    transitions: rettrans})
        end equate
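The agenda loop in equate can be hard to follow in the listing, so here is a much reduced Python sketch of the same propagation idea. It is not the thesis's implementation: it mutates its arguments instead of returning a new object, it uses a plain parent dictionary as a union-find structure, and it omits the class and closure checks that make the real procedure signal empty. The names find and equate and the sample node names are assumptions for the example.

```python
def find(parent, node):
    """Return the representative of node's equivalence class."""
    while parent.get(node, node) != node:
        node = parent[node]
    return node


def equate(parent, transitions, a, b):
    """Merge a and b, then merge the nodes reached from them by the same symbol."""
    agenda = [(a, b)]                     # "things to do": pairs still to be merged
    while agenda:
        u, v = agenda.pop()
        ru, rv = find(parent, u), find(parent, v)
        if ru == rv:
            continue                      # already equivalent, nothing to add
        tu = transitions.get(ru, {})
        tv = transitions.get(rv, {})
        for sym in tu.keys() & tv.keys():
            agenda.append((tu[sym], tv[sym]))   # successors must agree too
        parent[rv] = ru
        merged = dict(tv)
        merged.update(tu)
        transitions[ru] = merged          # the class shares one transition map
        transitions.pop(rv, None)
    return parent


# Two array-typed nodes: equating them forces their element nodes together.
parent = {}
transitions = {"a": {"ARRAY": "a_elem"}, "b": {"ARRAY": "b_elem"}}
equate(parent, transitions, "a", "b")
```

Equating a and b puts the pair (a_elem, b_elem) on the agenda, which is the role played by pairs_which_must_be_same in the listing.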

    % internal routine decides if two nodes are compatible.
    % does check the closure condition
    % what we have to do is look at a rep from the domain of the transitions to
    % see if they are the same class. If there is none, then it's ok on this.
    % we also have to check the closure condition
    % check that both of these are true:






ps:<kuszmaul. thesis. valclu>sota.clu. 62 28 April 1984 12:36:01 Page 3 



    %   os.closures[n1] is undefined or contains domain(os.transitions[n2])
    %   os.closures[n2] is undefined or contains domain(os.transitions[n1])
    % if they are both defined then this is equivalent to testing
    % if the intersection of the closure conditions contains the union
    % of the domains (This is equivalent because we already knew
    % that the closures contained the domain of their own functions)
    compatible = proc(os: rep, n1, n2: nodename) returns(bool)
        t1: tano := os.transitions[n1]
            except when undefined: t1 := tano$create() end
        t2: tano := os.transitions[n2]
            except when undefined: t2 := tano$create() end
        if tano$pick_from_domain(t1).class ~=
           tano$pick_from_domain(t2).class
           then return(false) end
           except when none: end   % ok so far
        begin
            c1: sa := os.closures[n1]
            if ~tano$domain_is_in(t2, c1) then return(false) end
            end except when undefined:
                end   % it is ok if os.closures[n1] is undefined
        begin
            c2: sa := os.closures[n2]
            if ~tano$domain_is_in(t1, c2) then return(false) end
            end except when undefined: end   % it is ok
        return(true)
        end compatible
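The two tests made by compatible (same symbol class, and each node's transitions inside the other node's closure) can be restated in a few lines of Python. This is a sketch under assumptions: transitions and closures are ordinary dictionaries keyed by node name, a missing key plays the role of the undefined signal, and sym_class is a hypothetical table giving the class of each symbol.

```python
def compatible(transitions, closures, sym_class, n1, n2):
    """Decide whether nodes n1 and n2 could be made equivalent."""
    t1 = transitions.get(n1, {})
    t2 = transitions.get(n2, {})
    # Class check: one representative symbol per node is enough, because the
    # symbols leaving a single node already share a class.
    if t1 and t2:
        if sym_class[next(iter(t1))] != sym_class[next(iter(t2))]:
            return False
    # Closure check: a node without a closure imposes no restriction.
    if n1 in closures and not set(t2) <= closures[n1]:
        return False
    if n2 in closures and not set(t1) <= closures[n2]:
        return False
    return True


sym_class = {"ARRAY": "array", "STREAM": "stream", "INT": "base", "BOOL": "base"}
transitions = {"a": {"ARRAY": "a1"}, "b": {"INT": None}, "c": {"ARRAY": "c1"}}
closures = {"c": {"ARRAY"}}
```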

    % internal routine which returns a set of pairs that must be the same
    % if the elements of s1 and s2 are to be the same under a modified OS.
    % the reason we don't accept the union of s1 and s2 is that we
    % would have to return all the pairs in (S1|S2) CROSS (S1|S2),
    % which is no fun.
    % this way, we won't have to return any such pairs, which speeds things up
    % (of course, we can if we want to, no guarantees here.)
    % the pairs that we return are the ones where
    pairs_which_must_be_same = proc(os: rep, s1, s2: set[nodename])
                                 returns(agenda)
        % let s := s1 union s2
        % for each element in a
        %
        % for each element in s1
        retset: agenda := agenda$create()
        sn: sequence[nodename] := set[nodename]$set2seq(s1)
                                  || set[nodename]$set2seq(s2)
        for i: int in sequence[nodename]$indexes(sn) do
            thisname: nodename := sn[i]
            thistano: tano := os.transitions[thisname]
                except when undefined: thistano := tano$create() end
            for symbol: alphabet in tano$domain_iter(thistano) do
                for j: int in int$from_to(i+1, sequence[nodename]$size(sn)) do
                    thatname: nodename := sn[j]
                    % add what you get if you follow SYMBOL from thisname and thatname
                    retset := retset +
                        nopair${first: os.transitions[thisname][symbol],
                                second: os.transitions[thatname][symbol]}
                        except when undefined:
                            end   % if a symbol is not there, don't worry
                    end
                end
            end
        return(retset)
        end pairs_which_must_be_same

    % if has_subpath exists, then we would like this to mean the same thing as
    %     a: rep, b: nodename := has_subpath(os, node_from, sym)
    %     return(equate(a, b, node_to))
    % but we don't use the intermediate node name
    % note that in any event, if node_from.is_terminator then signals terminator
    has_subpath_to = proc(os: cvt, node_from: nodename, sym: alphabet, node_to: nodename)
                       returns(cvt) signals(empty, terminator)
        % if sym is a terminator, then node_to would have to be an acceptor,
        % which is impossible
        if sym.is_terminator then signal terminator end
        % worry about closure first
        if ~sa$elementof(sym, os.closures[node_from])
           then signal empty end
           except when undefined: end   % it's ok
        nmap: tano := os.transitions[node_from]
            except when undefined: nmap := tano$[] end
        % check for the class restriction
        if tano$pick_from_domain(nmap).class ~= sym.class
           then signal empty end
           except when none: end   % it is ok
        already_to: no := os.transitions[node_from][sym]
            except
                when undefined:
                    % just build the new object and return it
                    return(rep${equivs: os.equivs,
                                closures: os.closures,
                                transitions:
                                    tntano$define_override(
                                        os.transitions,
                                        node_from,
                                        tano$define_override(
                                            nmap, sym, no$make_node(node_to)))})
                end
        % if it is an acceptor, it can't equate to node_to
        if no$is_acceptor(already_to) then signal empty end
        nat: nodename := no$value_node(already_to)
        if ernn$equivalent(os.equivs, nat, node_to) then
            return(os)
            end
        % it is defined, and it meets the closure condition, but the node
        % is not equivalent. Checks again to see if meets the class property
        % inside equate
        return(down(equate(up(os), nat, node_to))) resignal empty
        end has_subpath_to

    % has_subpath does the following:
    %   if os.transitions[node][sym] is defined, returns it
    %   otherwise, checks to see if the transitions that are already there
    %   are compatible with sym (if not signals empty)
    %   then creates an anonymous node which is transitioned to
    %   there if sym.is_terminator then
    %   returns the nodename that we go to on sym
    % has_subpath = proc(os: cvt, node: nodename, sym: alphabet)
    %               returns(cvt, nodename) signals(empty)
    %   has_subpath is not actually a defined function
    % end has_subpath

    % has_path adds path <node,sym> to the transitions
    % if ~sym.is_terminator then you get "non_terminator" signalled
    % if sym is incompatible with the current version, signals "empty"
    % it could either be incompatible with the closure
    % or the transition class
    has_path = proc(os: cvt, node: nodename, sym: alphabet)
                 returns(cvt) signals(empty, non_terminator)
        % check for a terminator
        if ~sym.is_terminator then signal non_terminator end
        % check for the transition already existing, if it does then just
        % return os because it is guaranteed to be an accepting node, because
        % sym.is_terminator is true
        if tano$defined(os.transitions[node], sym) then return(os) end
           except when undefined: end   % it is ok
        % check the closure condition
        if ~sa$elementof(sym, os.closures[node]) then signal empty end
           except when undefined: end   % it is ok
        % check the transition compatibility
        if tano$pick_from_domain(os.transitions[node]).class ~= sym.class
           then signal empty end
           except when undefined:
                  when none:
                  end   % it is ok
        % now return the new object
        old_tano: tano := os.transitions[node]
            except when undefined: old_tano := tano$create() end
        new_tano: tano := tano$define_override(old_tano, sym,
                                               no$make_acceptor(nil))
        newtntano: tntano := os.transitions
        for affected: nodename in set[nodename]$elements(os.equivs[node]) do
            newtntano := tntano$define_override(newtntano, affected, new_tano)
            end
        return(rep$replace_transitions(os, newtntano))
        end has_path

    % if os can't meet the closure condition, then signal empty
    % otherwise return os with the new closure condition
    close = proc(os: cvt, node: nodename, syms: set[alphabet])
              returns(cvt) signals(empty)
        if ~tano$domain_is_in(os.transitions[node], syms)
           then signal empty end
           except when undefined: end   % it is ok
        % now create the new os
        isyms: sa := syms & os.closures[node]
            except when undefined: isyms := syms end
        if sa$is_empty(isyms) then signal empty end
        retclosures: tnsa := os.closures
        eclass: set[nodename] := os.equivs[node]
            except when undefined: eclass := set[nodename]$[node] end
        % all the equivalent nodes should have equal maps
        for ntofix: nodename in set[nodename]$elements(eclass) do
            retclosures := tnsa$define_override(retclosures, ntofix, isyms)
            end
        return(rep$replace_closures(os, retclosures))
        end close

    % has_closed_path does close(has_path(os, node, sym), node, {sym})
    has_closed_path = proc(os: abstract, node: nodename, sym: alphabet)
                        returns(abstract) signals(empty)
        return(close(has_path(os, node, sym), node, sa$[sym]))
           resignal empty
        end has_closed_path

X returns the map, which describes the transition function for the 

X fsa which accepts the type assignment. 

X signals ambiguous if any of the nodes named dont have some transition 

X leading away from them. Nodes can be named in closures, equivs, or 

X they could have transition functions which are undefined everywhere 

X also signals ambiguous if the closure of a node is not exactly 

X equal to the domain of the of the transition function. This 

X has two special cases: 

X 1) a node does not have a closure (nodes 

X without a closure are ambiguous) 

X 2) a node has a closure, but some element of the closure does not 

X have a transition. 
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X (we are guaranteed that the domain of the transition is in the closure) 
get_un1que_type_assigninent ■ 
proc(os:Cvt) 
returns(map[nodename.map[alphabet.oneof [acceptor: null, node :nodename]]]) 

slgnals(ambiguous) 

    % check for ambiguity by finding mentioned nodes that are never used
    % several ways for it to be ambiguous: an entry could have a tano
    % with no entries, or there could be a named node somewhere with
    % no entry in transitions
    % or there could be a node mentioned in the closure that has no entry
    % in transitions
    % or there could be a node named in equivs with no entry in
    % transitions
    for nname:nodename, ntano:tano in tntano$entries(os.transitions) do
        % if the named node does not have a closure then ambiguous
        myclosure:sa:=os.closures[nname]
           except when undefined: signal ambiguous end
        % if any of the symbols in the closure don't have a transition
        % then ambiguous
        for symindom:alphabet in sa$elements(myclosure) do
            tano$fetch(ntano, symindom)
            end
           except when undefined: signal ambiguous end
        % if the named node has a completely undefined transition
        % function then ambiguous
        tano$pick_from_domain(ntano)
           except when none: signal ambiguous end
        % if any of the nodes in range of the transition
        % don't have closures or have undefined transition
        % functions then ambiguous
        for sym:alphabet, nrslt:no in tano$entries(ntano) do
            tagcase nrslt
               tag acceptor: % do nothing
               tag node(nto:nodename):
                   %% % if the node does not have a closure then it is
                   %% % ambiguous
                   %% tnsa$fetch(os.closures, nto)
                   %%    except when undefined: signal ambiguous end
                   %% note: all the nodes are checked for this
                   % if there is no transition from nto to another node
                   % it is ambiguous
                   tano$pick_from_domain(os.transitions[nto])
                      except when undefined, none: signal ambiguous end
               end
            end
        end
    % if any of the nodes mentioned in the equivalence classes
    % don't have closures or have undefined transitions
    % then ambiguous
    for nt_classes:set[nodename] in ernn$non_trivial_classes(os.equivs) do
        for mentioned:nodename in set[nodename]$elements(nt_classes) do
            tnsa$fetch(os.closures, mentioned)
               except when undefined: signal ambiguous end
            tano$pick_from_domain(os.transitions[mentioned])
               except when undefined, none: signal ambiguous end
            end
        end
    % if any of the nodes mentioned in the closures
    % don't have closures or have undefined transitions
    % then ambiguous
    for c_node:nodename in tnsa$domain_iter(os.closures) do
        tnsa$fetch(os.closures, c_node)
           except when undefined: signal ambiguous end
        tano$pick_from_domain(os.transitions[c_node])
           except when undefined, none: signal ambiguous end
        end
    return(os.transitions)
    end get_unique_type_assignment

% export returns a copy of the internal representation for os
% note that since everything is functional, this is perfectly safe
export = proc(os:cvt) returns(rep)
    return(os)
    end export
end sota 
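The ambiguity rules enforced by get_unique_type_assignment can be restated compactly. The following is only a Python sketch under stated assumptions, not the thesis code: plain dictionaries stand in for the `tntano` and `tnsa` maps, `None` stands in for the `acceptor` arm, symbols are plain strings, and the name `is_ambiguous` is hypothetical.

```python
from typing import Dict, Iterable, Optional, Set

# Assumption: None plays the role of the listing's `acceptor` arm.
Transitions = Dict[int, Dict[str, Optional[int]]]
Closures = Dict[int, Set[str]]


def is_ambiguous(transitions: Transitions,
                 closures: Closures,
                 classes: Iterable[Set[int]] = ()) -> bool:
    """Return True where the listing would signal `ambiguous`."""

    def has_transition(node: int) -> bool:
        # A node needs a transition table with at least one entry.
        return bool(transitions.get(node))

    for node, table in transitions.items():
        # Every node with a transition table must have a closure.
        if node not in closures:
            return True
        # Every symbol in the closure must have a transition.
        if any(sym not in table for sym in closures[node]):
            return True
        # A table that is undefined everywhere is ambiguous.
        if not table:
            return True
        # Every node reached by a transition must itself have transitions.
        for target in table.values():
            if target is not None and not has_transition(target):
                return True
    # Nodes named in non-trivial equivalence classes need both pieces.
    for cls in classes:
        for node in cls:
            if node not in closures or not has_transition(node):
                return True
    # Nodes named in closures need transitions.
    return any(not has_transition(node) for node in closures)
```

For example, N1 = ARRAY[N2] with N2 = INT is unambiguous once both nodes are closed, while equating two nodes that have no transitions at all is ambiguous.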
#extend

set = cluster[t:type] is
        create, new,            % these are the same
        add,                    % add a new element
        contains, gt,           % these are the same
        elementof,              % other direction for contains
        mem,                    % does some element of a set satisfy a predicate
        elements, cons, pick, pick_rest, is_empty,   % misc
        equal,                  % are they the same set?
        union, or,              % these are the same
        intersection, and,      % these are the same
        sub,                    % set subtraction
        set2seq
    where t has equal: proctype(t,t) returns(bool)

rep = sequence[t]

% create the empty set
new = proc() returns(cvt) return(rep$[]) end new

% add an element
add = proc(s:cvt, el:t) returns(cvt)
    if up(s)>el then return(s) else return(rep$addh(s,el)) end
    end add
%   low:int:=1
%   high:int:=rep$size(s)
%   while low<=high do
%       i:int:=(low+high)/2
%       if s[i]=el then return(s)
%        elseif s[i]<el then low:=i+1
%        else high:=i-1
%        end
%       end
%   return(rep$subseq(s,1,high)
%          || rep$[el]
%          || rep$subseq(s,low,rep$size(s)-high))
%   end add

% membership operator
gt = proc(s:cvt, el:t) returns(bool)
    for elin:t in rep$elements(s) do
        if elin=el then return(true) end
        end
    return(false)
%   low:int:=1
%   high:int:=rep$size(s)
%   while low<=high do
%       i:int:=(low+high)/2
%       if s[i]=el then return(true)
%        elseif s[i]<el then low:=i+1
%        else high:=i-1
%        end
%       end
%   return(false)
    end gt

% the other name for the membership operator
contains = proc(s:set[t], el:t) returns(bool)
    return(s>el)
    end contains

% the other direction for the membership operator
elementof = proc(el:t, s:set[t]) returns(bool)
    return(s>el)
    end elementof

% return true iff there is an element K in S, such that PRED(EL,K)
mem = proc(el:t, s:set[t], pred:proctype(t,t) returns(bool)) returns(bool)
    for knownel:t in set[t]$elements(s) do
        if pred(el,knownel) then return(true) end
        end
    return(false)
    end mem

elements = iter(s:cvt) yields(t)
    for e:t in rep$elements(s) do yield(e) end
    end elements

cons = proc(s:sequence[t]) returns(set[t])
    retval:set[t]:=set[t]$[]
    for e:t in sequence[t]$elements(s) do
        retval:=retval+e
        end
    return(retval)
    end cons

pick = proc(s:cvt) returns(t) signals(empty)
    return(s[1]) except when bounds: signal empty end
    end pick

pick_rest = proc(s:cvt) returns(t,cvt) signals(empty)
    return(s[1],rep$reml(s))
       except when bounds: signal empty end
    end pick_rest

is_empty = proc(s:cvt) returns(bool)
    return(rep$empty(s))
    end is_empty

% two sets are the same if they have exactly the same elements
equal = proc(s1,s2:cvt) returns(bool)
    if s1=s2 then return(true) end % might as well optimize
    if rep$size(s1)~=rep$size(s2) then return(false) end
    for el:t in elements(up(s1)) do
        if up(s2)~>el then return(false) end
        end
    % everything in s2 is in s1, and they are in 1-1 correspondence, so
    return(true)
    end equal

or = proc(s1,s2:set[t]) returns(set[t])
    for el:t in elements(s1) do
        s2:=s2+el
        end
    return(s2)
%   size1:int:=rep$size(s1)
%   size2:int:=rep$size(s2)
%   retval:array[t]:=array[t]$predict(1,size1+size2)
%   indx1:int:=1
%   indx2:int:=1
%   while indx1<=size1 cand indx2<=size2 do
%       if s1[indx1]=s2[indx2] then
%           array[t]$addh(retval,s1[indx1])
%           indx1:=indx1+1
%           indx2:=indx2+1
%        elseif s1[indx1]<s2[indx2] then
%           array[t]$addh(retval,s1[indx1])
%           indx1:=indx1+1
%        else
%           array[t]$addh(retval,s2[indx2])
%           indx2:=indx2+1
%        end
%       end
%   % one of the indx's is over
%   if indx1<size1 then
%       return(rep$a2s(retval)||rep$subseq(s1,indx1+1,size1-indx1))
%    elseif indx2<size2 then
%       return(rep$a2s(retval)||rep$subseq(s2,indx2+1,size2-indx2))
%    else return(rep$a2s(retval))
%    end
    end or

union = proc(s1,s2:set[t]) returns(set[t]) return(s1|s2) end union

and = proc(s1,s2:set[t]) returns(set[t])
    retset:set[t]:=set[t]$[]
    for el:t in elements(s1) do
        if s2>el then retset:=retset+el end
        end
    return(retset)
%   size1:int:=rep$size(s1)
%   size2:int:=rep$size(s2)
%   retval:array[t]:=array[t]$predict(1,int$min(size1,size2))
%   indx1:int:=1
%   indx2:int:=1
%   while indx1<=size1 cand indx2<=size2 do
%       if s1[indx1]=s2[indx2] then
%           array[t]$addh(retval,s1[indx1])
%           indx1:=indx1+1
%           indx2:=indx2+1
%        elseif s1[indx1]<s2[indx2] then indx1:=indx1+1
%        else indx2:=indx2+1
%        end
%       end
%   return(rep$a2s(retval))
    end and

intersection = proc(s1,s2:set[t]) returns(set[t]) return(s1&s2) end intersection

sub = proc(s1,s2:set[t]) returns(set[t])
    retset:set[t]:=set[t]$[]
    for el:t in elements(s1) do
        if s2~>el then retset:=retset+el end
        end
    return(retset)
%   size1:int:=rep$size(s1)
%   size2:int:=rep$size(s2)
%   retval:array[t]:=array[t]$predict(1,int$min(size1,size2))
%   indx1:int:=1
%   indx2:int:=1
%   while indx1<=size1 cand indx2<=size2 do
%       if s1[indx1]=s2[indx2] then
%           indx1:=indx1+1
%           indx2:=indx2+1
%        elseif s1[indx1]<s2[indx2] then
%           array[t]$addh(retval,s1[indx1])
%           indx1:=indx1+1
%        else indx2:=indx2+1
%        end
%       end
%   if indx1<size1 then
%       return(rep$a2s(retval)||rep$subseq(s1,indx1+1,size1-indx1))
%    else return(rep$a2s(retval))
%    end
    end sub

set2seq = proc(s:cvt) returns(sequence[t])
    return(s)
    end set2seq

create = proc() returns(cvt)
    return(rep$new())
    end create
end set
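The commented-out bodies of `or`, `and`, and `sub` above outline a merge over sorted sequences, which needs an ordering on `t` as well as equality. The following Python sketch shows that merge technique on sorted lists of integers. It is an illustration under those assumptions rather than a transcription of the commented code, and the function names are hypothetical.

```python
from typing import List


def merge_union(s1: List[int], s2: List[int]) -> List[int]:
    """Union of two sorted, duplicate-free lists."""
    out: List[int] = []
    i = j = 0
    while i < len(s1) and j < len(s2):
        if s1[i] == s2[j]:
            out.append(s1[i])
            i += 1
            j += 1
        elif s1[i] < s2[j]:
            out.append(s1[i])
            i += 1
        else:
            out.append(s2[j])
            j += 1
    # At most one of the two tails is non-empty here.
    out.extend(s1[i:])
    out.extend(s2[j:])
    return out


def merge_intersection(s1: List[int], s2: List[int]) -> List[int]:
    """Elements present in both sorted lists."""
    out: List[int] = []
    i = j = 0
    while i < len(s1) and j < len(s2):
        if s1[i] == s2[j]:
            out.append(s1[i])
            i += 1
            j += 1
        elif s1[i] < s2[j]:
            i += 1
        else:
            j += 1
    return out


def merge_difference(s1: List[int], s2: List[int]) -> List[int]:
    """Elements of s1 that do not occur in s2."""
    out: List[int] = []
    i = j = 0
    while i < len(s1) and j < len(s2):
        if s1[i] == s2[j]:
            i += 1
            j += 1
        elif s1[i] < s2[j]:
            out.append(s1[i])
            i += 1
        else:
            j += 1
    # Whatever remains of s1 cannot be in s2.
    out.extend(s1[i:])
    return out
```

Each merge is linear in the combined length, whereas the live code in the cluster performs a linear membership scan per element and needs only `equal` on `t`.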
#extend

equivrel = cluster[T:type] is
        create, equate, non_trivial_classes, fetch, equivalent, cons, new, equal
    where T has equal: proctype(t,t) returns(bool)
% this is immutable
rep=map[T,set[T]]

% abstraction function A(e:rep): if e[x] is defined, then x is in
% the class with elements of e[x].  If e[x] is undefined, x is in { x }
% by itself

% rep invariant R(r:rep): if r[x] is defined then |r[x]|>1 and
% x is in r[x], and for all y in r[x] r[y] is defined

% return an equivalence relation with no relations.
% every element of T has its own class
create = proc() returns(cvt)
    return(rep$create())
    end create
% create an equivalence relation with the added relationship vali = valj
equate = proc(er:cvt, vali,valj:T) returns(cvt)
    if set[T]$ElementOf(valj,up(er)[vali]) then return(er) end
    newclass:set[T]:=set[T]$Union(up(er)[vali],up(er)[valj])
    for affected:T in set[T]$elements(newclass) do
        er:=rep$define_override(er,affected,newclass)
        end
    return(er)
    end equate

% yield all the classes which have more than one element in them
% watch out!  This does not yield all the classes because there
% is no way to generate a complete list of T.  Anything
% not yielded is in its own class
non_trivial_classes = iter(er:cvt) yields(set[T])
    did:set[T]:=set[T]$create()
    for elt:T, i:set[T] in rep$entries(er) do
        if ~set[T]$ElementOf(elt,did) then
            did:=did+elt
            yield(i)
            end
        end
    end non_trivial_classes

% returns the class that val is in
fetch = proc(er:cvt, val:t) returns(set[t])
    % if t is not defined, then return set[t]$[val]
    return(er[val]) except when undefined: return(set[t]$[val]) end
    end fetch

% if vali is in er[valj] then return true, else false
equivalent = proc(er:equivrel[T], vali,valj:t) returns(bool)
    return(set[T]$elementOf(vali,er[valj]))
    end equivalent

new = proc() returns(equivrel[T]) return(create()) end new

cons = proc(ss:sequence[set[T]]) returns(cvt) signals(not_well_defined)
    ret:rep:=rep$create()
    for cl:set[T] in sequence[set[T]]$elements(ss) do
        for el:T in set[T]$elements(cl) do
            ret:=rep$define(ret,el,cl)
               except when already_defined: signal not_well_defined end
            end
        end
    return(ret)
    end cons

% this depends on the fact that there are no singletons!
equal = proc(a,b:cvt) returns(bool)



% ps:(kuszmaul.thesis.valclu>equivrel.clu.14   21 April 1984 14:00:57   Page 2



return(a=b) 
end equal 
end equivrel 
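The representation used by equivrel (a map from each element to its whole class, with undefined entries meaning a singleton class) can be sketched in Python as follows. This is an illustration only: a `dict` of `frozenset` values stands in for `map[T,set[T]]`, and the sketch marks the entire class as seen when iterating so that each class is produced exactly once, which is a choice of this sketch.

```python
from typing import Dict, FrozenSet, Hashable, Iterator

# Assumption: dict[element] -> whole class; missing key means {element}.
EquivRel = Dict[Hashable, FrozenSet[Hashable]]


def fetch(er: EquivRel, val: Hashable) -> FrozenSet[Hashable]:
    """Class containing val; an unmentioned value is alone in its class."""
    return er.get(val, frozenset([val]))


def equate(er: EquivRel, a: Hashable, b: Hashable) -> EquivRel:
    """Return a relation in which a and b are equivalent (er is not mutated)."""
    if b in fetch(er, a):
        return er
    merged = fetch(er, a) | fetch(er, b)
    result = dict(er)
    # Every member of the merged class must point at the merged class.
    for member in merged:
        result[member] = merged
    return result


def non_trivial_classes(er: EquivRel) -> Iterator[FrozenSet[Hashable]]:
    """Yield each class with more than one element exactly once."""
    seen = set()
    for elt, cls in er.items():
        if elt not in seen:
            seen |= cls
            yield cls
```

Because `equate` copies before updating, earlier relations stay valid, matching the "this is immutable" remark in the cluster.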
map = cluster[domain,range:type] is
        create, fetch,
        define, define_override, cons, new,
        defined,
        pick_from_domain, domain_is_in, domain_iter, entries,
        equal
    where domain has equal: proctype(domain,domain) returns(bool),
          range has equal: proctype(range,range) returns(bool)
rep = oneof[empty:null,
            onedefined:entry]
entry = struct[d:domain,
               r:range,
               rest:map[domain,range]]

% returns a function which is undefined everywhere
create = proc() returns(cvt)
    return(rep$make_empty(nil))
    end create

% if fun(x) is defined, then fun(x) is returned, else signals undefined
fetch = proc(fun:map[domain,range], x:domain) returns(range)
        signals(undefined)
    for d:domain, r:range in entries(fun) do
        if d=x then return(r) end
        end
    signal undefined
    end fetch

% if fun(x) is defined, then returns true, else false
defined = proc(fun:map[domain,range], x:domain) returns(bool)
    fetch(fun,x) except when undefined: return(false) end
    return(true)
    end defined

% if fun(x) is defined to be different from f_of_x,
% then signals already_defined
% otherwise, returns a function which is the same as fun, except that
% it is defined to be f_of_x at x.
define = proc(fun:map[domain,range], x:domain, f_of_x:range)
        returns(cvt) signals(already_defined)
    if fun[x]=f_of_x then return(down(fun))
       else signal already_defined end
       except when undefined:
                   return(rep$make_onedefined(
                              entry${d:x,r:f_of_x,rest:fun}))
              end
    end define

% an internal routine which signals SAME if fun(x)=f_of_x
% if fun(x) is undefined signals undefined
% and otherwise returns a function which is the same as fun, except that
% fun[x]=f_of_x
do_define_override = proc(fun:cvt, x:domain, f_of_x:range)
        returns(cvt) signals(same, undefined)
    tagcase fun
       tag empty: signal undefined
       tag onedefined(e:entry):
           if e.d=x then
               if e.r=f_of_x then signal same
                  else return(rep$make_onedefined(
                                  entry$replace_r(e,f_of_x)))
                  end
              else return(rep$make_onedefined(
                              entry$replace_rest(e,do_define_override(
                                                       e.rest,x,f_of_x))))
                      resignal same, undefined
              end
       end
    end do_define_override
% returns fun, except that it is defined to be f_of_x at x
% this overrides any old defns that fun had
define_override = proc(fun:map[domain,range], x:domain, f_of_x:range)
        returns(map[domain,range])
    % we must get rid of the previous definition, so we can't do it
    % smoothly by just consing a new thing onto the head
    return(do_define_override(fun,x,f_of_x))
       except when same: return(fun)
              when undefined:
                   return(up(rep$make_onedefined(
                                 entry${d:x,r:f_of_x,rest:fun})))
              end
    end define_override

new = proc() returns(map[domain,range]) return(create()) end new

cons = proc(ents:sequence[struct[d:domain,r:range]])
        returns(map[domain,range])
        signals(not_well_defined)
    en=struct[d:domain,r:range]
    ret:map[domain,range]:=map[domain,range]$create()
    for e:en in sequence[en]$elements(ents) do
        ret:=define(ret,e.d,e.r)
           except when already_defined: signal not_well_defined end
        end
    return(ret)
    end cons

% if fun is undefined for all values then signals (none),
% else returns a value for which fun is defined
pick_from_domain = proc(fun:cvt) returns(domain) signals(none)
    tagcase fun
       tag empty: signal none
       tag onedefined(e:entry): return(e.d)
       end
    end pick_from_domain

% if domain(fun) is in superdomain returns true, else false
domain_is_in = proc(fun:map[domain,range], superdomain:set[domain])
        returns(bool)
    for d:domain in domain_iter(fun) do
        if ~set[domain]$ElementOf(d,superdomain) then return(false) end
        end
    return(true)
    end domain_is_in

% yields all the values in domain(fun)
domain_iter = iter(fun:map[domain,range]) yields(domain)
    for d:domain, r:range in entries(fun) do
        yield(d)
        end
    end domain_iter

% yields the pairs (d,r) where r=fun[d], and d is in the domain(fun)
entries = iter(fun:cvt) yields(domain,range)
    while (true) do
        tagcase fun
           tag empty: return
           tag onedefined(e:entry): yield(e.d,e.r) fun:=down(e.rest)
           end
        end
    end entries

equal = proc(f1,f2:map[domain,range]) returns(bool)
    d:domain r:range
    begin
        for d,r in entries(f1) do
            if f2[d]~=r then return(false) end
            end
        for d,r in entries(f2) do
            if f1[d]~=r then return(false) end
            end
        end
       except when undefined: return(false) end
    return(true)
    end equal
end map 
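The map cluster is a persistent association list: `define` refuses to change an existing binding, while `define_override` rebuilds only the prefix of the list up to the overridden entry. A Python sketch of that idea follows. It is an approximation under stated assumptions: `None` stands in for the `empty` arm, a `NamedTuple` for the `entry` struct, and the exception classes and the helper `_replace_at` are hypothetical names.

```python
from typing import Any, Iterator, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    d: Any                    # domain value
    r: Any                    # range value
    rest: Optional["Entry"]   # remainder of the map; None means empty


class Undefined(Exception):
    pass


class AlreadyDefined(Exception):
    pass


def entries(fun: Optional[Entry]) -> Iterator[Tuple[Any, Any]]:
    while fun is not None:
        yield fun.d, fun.r
        fun = fun.rest


def fetch(fun: Optional[Entry], x: Any) -> Any:
    for d, r in entries(fun):
        if d == x:
            return r
    raise Undefined(x)


def define(fun: Optional[Entry], x: Any, value: Any) -> Optional[Entry]:
    """Add a binding; an existing different binding is an error."""
    try:
        old = fetch(fun, x)
    except Undefined:
        return Entry(x, value, fun)   # cons the new binding onto the head
    if old == value:
        return fun
    raise AlreadyDefined(x)


def _replace_at(fun: Entry, x: Any, value: Any) -> Entry:
    # Copy entries up to x; the tail after x is shared with the old map.
    if fun.d == x:
        return fun._replace(r=value)
    return fun._replace(rest=_replace_at(fun.rest, x, value))


def define_override(fun: Optional[Entry], x: Any, value: Any) -> Optional[Entry]:
    """Add or replace a binding without mutating the original map."""
    try:
        old = fetch(fun, x)
    except Undefined:
        return Entry(x, value, fun)
    return fun if old == value else _replace_at(fun, x, value)
```

Since no entry is ever mutated, a map handed out earlier (for example by `export`) keeps its old contents after later overrides.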
#extend
alphabet=struct[class:string,
                is_terminator:bool,
                name:string]
nodename = int % we will use negatives if we need dummy's
classname=string

vimsota=sota[alphabet,nodename,classname]
tmap=map[alphabet,oneof[acceptor:null,node:nodename]]
vimsotarep=struct[equivs:ernn, closures:tnsa, transitions:tntano]
ernn=equivrel[nodename]
snn=set[nodename]
tnsa = map[nodename,sa]
sa=set[alphabet]
tntano=map[nodename,tano]
tn_ent=struct[d:nodename,r:tano]
ta_ent=struct[d:alphabet,r:no]
ts_ent=struct[d:nodename,r:sa]
tano=map[alphabet,no]
no=oneof[acceptor:null, node:nodename]

% this routine does some testing on the sota
sotatest = proc()
    vsr=vimsotarep
    putl=stream$putl
    po:stream:=stream$primary_output()
    % first test, do a create, and get the rep which should be totally empty
    s_create:vimsota:=vimsota$create()
    sexpect("s_create",s_create,
            vsr${equivs:ernn$[], closures:tnsa$[], transitions:tntano$[]},
            true)
    % now we have really tested the create out.  That really only
    % gives us a little confidence in the lower level objects,
    % since create is so simple.
    noa:no:=no$make_acceptor(nil)
    no1:no:=no$make_node(1) no2:no:=no$make_node(2) no3:no:=no$make_node(3)
    no4:no:=no$make_node(4) no5:no:=no$make_node(5) no6:no:=no$make_node(6)
    a_int:alphabet:=alphabet${class:"INT", is_terminator:TRUE, name:"INT"}
    a_string:alphabet:=
        alphabet${class:"STRING", is_terminator:TRUE, name:"STRING"}
    a_real:alphabet:=alphabet${class:"REAL", is_terminator:TRUE, name:"REAL"}
    a_array:alphabet:=
        alphabet${class:"ARRAY", is_terminator:FALSE, name:"ARRAY"}
    a_geta:alphabet:=
        alphabet${class:"STRUCT", is_terminator:FALSE, name:"GET_A"}
    a_getb:alphabet:=
        alphabet${class:"STRUCT", is_terminator:FALSE, name:"GET_B"}
    a_getc:alphabet:=
        alphabet${class:"STRUCT", is_terminator:FALSE, name:"GET_C"}

    % let's try equating two nodes.  We should then get an ambiguous error
    % if we try to get the typemap
    s_1e2:vimsota:=vimsota$equate(s_create,1,2)
    sexpect("s_1e2", s_1e2,
            vsr${equivs:ernn$[snn$[1,2]],
                 closures:tnsa$[], transitions:tntano$[]},
            false)
    % the transitions and closures should be completely undefined
    % the equivclass should have exactly {1,2} in it

    % try something really fancy:
    % a real problem: N1 is an array of n2
    %                 N2 is an int
    %   does it work?
    %
    % N1 = ARRAY[N2]
    % N2 = ARRAY[N1]
    %   does it work?
    %
    % N1 = INT
    % N2 = ARRAY
    %   does it not work?
    %
    % N1 = ARRAY[N2]
    % N3 = ARRAY[N4]
    % N1 = N3
    %   does it work?
    %
    % N1 = ARRAY[N2]
    % N3 = ARRAY[N4]
    % N2 = INT
    % N4 = STRING
    % N1 = N3
    %   does it not work
    %
    % N1 = CLOSED_STRUCT[A:N2,B:N3]
    % N2 = INT
    % N3 = STRING
    % N1 = OPEN_STRUCT[A:N4]
    %   does it work
    %
    % N1 = CLOSED_STRUCT[A:N2,B:N3]
    % N2 = INT
    % N3 = STRING
    % N1 = OPEN_STRUCT[C:N3]
    %   does it not work
    %
    % N1 = CLOSED_STRUCT[A:n2,B:N3]
    % N2 = INT
    % N4 = CLOSED_STRUCT[A:n5,b:n6]
    % n6 = STRING
    % N1 = N4
    %   does it work
    %
    % that pretty well tests the closure with equates
    % now for some recursion
    % N1 = ARRAY[N1]
    %   does it work?
    %
    % N1 = ARRAY[N2]
    % N2 = ARRAY[N1]
    %   does it work?
    %
    % N1 = CLOSED_STRUCT[a:N2, b:N3]
    % N2 = CLOSED_STRUCT[a:N2, b:N4]
    % N4 = N1
    %   does it work?
    %
    % N1 = CLOSED_STRUCT[a:N2, b:N3]
    % N2 = CLOSED_STRUCT[a:N1, c:N3]
    %   does it work?
    %
    % N1 = CLOSED_STRUCT[A:N2, B:N3]
    % N2 = N1
    % N3 = N2
    % N3 = CLOSED_STRUCT[A:N2, C:N3]
    %   does it not work?
    %
    % test the terminators to see if it won't allow has_path to be a non-terminator
    % N1 = ARRAY[N2]
    %   does it not work (ambiguity)
    % the comments are repeated:
    % try something really fancy:
    % a real problem: N1 is an array of n2
    %                 N2 is an int
    %   does it work?
    %

    ev:vsr:=vsr${equivs:ernn$[],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_array]},
                                ts_ent${d:2,r:sa$[a_int]}],
                 transitions:tntano$[
                     tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no2}]},
                     tn_ent${d:2,r:tano$[ta_ent${d:a_int,r:noa}]}]}
    sexpect("N2=INT,N1=ARRAY[N2]",
            vimsota$has_subpath_to(
                vimsota$close(vimsota$has_closed_path(s_create,2,a_int),
                              1,sa$[a_array]),
                1,a_array,2),
            ev, true)
    sexpect("N1=ARRAY[N2],N2=INT",
            vimsota$has_closed_path(
                vimsota$close(vimsota$has_subpath_to(s_create,1,a_array,2),
                              1,sa$[a_array]),
                2,a_int),
            ev,true)

    % N1 = ARRAY[N2]
    % N2 = ARRAY[N1]
    %   does it work?
    sexpect("N1=ARRAY[N2],N2=ARRAY[N1]",
            vimsota$close(
                vimsota$has_subpath_to(
                    vimsota$close(
                        vimsota$has_subpath_to(s_create,1,a_array,2),
                        1,sa$[a_array]),
                    2,a_array,1),
                2,sa$[a_array]),
            vsr${equivs:ernn$[],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_array]},
                                ts_ent${d:2,r:sa$[a_array]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no2}]},
                             tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
            true)

    % N1 = INT
    % N2 = ARRAY[N1]
    %   does it work?
    sexpect("N1=INT,N2=ARRAY[N1]",
            vimsota$close(vimsota$has_subpath_to(
                              vimsota$has_closed_path(s_create,1,a_int),
                              2,a_array,1),
                          2,sa$[a_array]),
            vsr${equivs:ernn$[],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_int]},
                                ts_ent${d:2,r:sa$[a_array]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_int,r:noa}]},
                             tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
            true)

    % same thing without the closure on the int
    sexpect("N1=NC_INT,N2=ARRAY[N1]",
            vimsota$close(vimsota$has_subpath_to(
                              vimsota$has_path(s_create,1,a_int),
                              2,a_array,1),
                          2,sa$[a_array]),
            vsr${equivs:ernn$[],
                 closures:tnsa$[ts_ent${d:2,r:sa$[a_array]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_int,r:noa}]},
                             tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
            false)

    % same thing, without the closure on the array
    sexpect("N1=INT,N2=NC_ARRAY[N1]",
            vimsota$has_subpath_to(
                vimsota$has_closed_path(s_create,1,a_int),
                2,a_array,1),
            vsr${equivs:ernn$[],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_int]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_int,r:noa}]},
                             tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
            false)



    % N1 = ARRAY[N2]
    % N3 = ARRAY[N4]
    % N1 = N3
    % N2 = INT
    %   should work when all closed
    tmp1:vimsota:=vimsota$close(
                      vimsota$close(
                          vimsota$has_subpath_to(
                              vimsota$has_subpath_to(s_create,1,a_array,2),
                              3,a_array,4),
                          3,sa$[a_array]),
                      1,sa$[a_array])
    tmp1:=vimsota$has_closed_path(vimsota$equate(tmp1,1,3),
                                  2,a_int)
    sexpect("N1=A[N2],N3=A[N4],N1=N3,N2=INT",
            tmp1,
            vsr${equivs:ernn$[snn$[1,3],snn$[2,4]],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_array]},
                                ts_ent${d:2,r:sa$[a_int]},
                                ts_ent${d:3,r:sa$[a_array]},
                                ts_ent${d:4,r:sa$[a_int]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no2}]},
                             tn_ent${d:2,r:tano$[ta_ent${d:a_int,r:noa}]},
                             tn_ent${d:3,r:tano$[ta_ent${d:a_array,r:no4}]},
                             tn_ent${d:4,r:tano$[ta_ent${d:a_int,r:noa}]}]},
            true)

    % N1 = ARRAY[N2]
    % N3 = ARRAY[N4]
    % N2 = INT
    % N4 = STRING
    % N1 = N3
    %   can't build it, don't even bother with the closures
    tmp:vimsota:=
        vimsota$has_path(vimsota$has_path(
                             vimsota$has_subpath_to(
                                 vimsota$has_subpath_to(s_create,1,a_array,2),
                                 3,a_array,4),
                             2,a_int),
                         4,a_string)
    % should work up to here
    begin
        vimsota$equate(tmp,1,3)
        stream$putl(stream$primary_output(),"Can build exA, wrong")
        signal failure("Can build exA, wrong")
        end
       except when empty:
                   stream$putl(stream$primary_output(),"Can't build exA, ok")
              end

    % N1 = CLOSED_STRUCT[A:N2,B:N3]
    % N2 = INT
    % N3 = STRING
    % N1 = OPEN_STRUCT[A:N4]
    %   should work
    tmp:=vimsota$has_subpath_to(
             vimsota$has_subpath_to(
                 vimsota$close(s_create,1,sa$[a_geta,a_getb]),
                 1,a_geta,2),
             1,a_getb,3)
    tmp:=vimsota$has_closed_path(tmp,2,a_int)
    tmp:=vimsota$has_closed_path(tmp,3,a_string)
    tmp:=vimsota$has_subpath_to(tmp,1,a_geta,4)
    sexpect("N1=CS[A:N2,B:N3],N2=INT,N3=S,N1=O[a:n4]",tmp,
            vsr${equivs:ernn$[snn$[2,4]],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_geta,a_getb]},
                                ts_ent${d:2,r:sa$[a_int]},
                                ts_ent${d:3,r:sa$[a_string]},
                                ts_ent${d:4,r:sa$[a_int]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_geta,r:no2},
                                                 ta_ent${d:a_getb,r:no3}]},
                             tn_ent${d:2,r:tano$[ta_ent${d:a_int,r:noa}]},
                             tn_ent${d:3,r:tano$[ta_ent${d:a_string,r:noa}]},
                             tn_ent${d:4,r:tano$[ta_ent${d:a_int,r:noa}]}]},
            true)



    %
    % N1 = CLOSED_STRUCT[A:N2,B:N3]
    % N1 = OPEN_STRUCT[C:N3]
    %   does it not work because of closure violation
    tmp:=vimsota$close(vimsota$has_subpath_to(
                           vimsota$has_subpath_to(s_create,1,a_geta,2),
                           1,a_getb,3),
                       1,sa$[a_geta,a_getb])
    begin
        vimsota$has_subpath_to(tmp,1,a_getc,3)
        signal failure("Could build s_cab_pc, wrong")
        end
       except when empty: stream$putl(stream$primary_output(),
                                      "Couldn't build s_cab_pc, ok") end

    % N1 = OPEN_STRUCT[A:n2,B:N3]
    % N2 = INT
    % N4 = OPEN_STRUCT[A:n5,b:n6]
    % n6 = STRING
    % N1 = N4
    %   does it not work
    tmp:=vimsota$has_subpath_to(vimsota$has_subpath_to(s_create,1,a_geta,2),
                                1,a_getb,3)
    tmp:=vimsota$has_closed_path(tmp,2,a_int)
    tmp:=vimsota$has_subpath_to(vimsota$has_subpath_to(tmp,1,a_geta,5),
                                1,a_getb,6)
    tmp:=vimsota$has_closed_path(vimsota$equate(tmp,1,4),6,a_string)
    _14_trans:tano:=tano$[ta_ent${d:a_geta,r:no2},
                          ta_ent${d:a_getb,r:no3}]
    _25_trans:tano:=tano$[ta_ent${d:a_int,r:noa}]
    _36_trans:tano:=tano$[ta_ent${d:a_string,r:noa}]
    mytrans:tntano:=tntano$[tn_ent${d:1, r:_14_trans},
                            tn_ent${d:4, r:_14_trans},
                            tn_ent${d:2, r:_25_trans},
                            tn_ent${d:5, r:_25_trans},
                            tn_ent${d:3, r:_36_trans},
                            tn_ent${d:6, r:_36_trans}]
    sexpect("Two-defined struct unclosed", tmp,
            vsr${equivs:ernn$[snn$[2,5],snn$[3,6],snn$[1,4]],
                 closures:tnsa$[ts_ent${d:2,r:sa$[a_int]},
                                ts_ent${d:5,r:sa$[a_int]},
                                ts_ent${d:3,r:sa$[a_string]},
                                ts_ent${d:6,r:sa$[a_string]}],
                 transitions:mytrans},
            false)
    sexpect("Two-defined struct closed",
            vimsota$close(tmp,1,sa$[a_geta,a_getb]),
            vsr${equivs:ernn$[snn$[2,5],snn$[3,6],snn$[1,4]],
                 closures:tnsa$[ts_ent${d:2,r:sa$[a_int]},
                                ts_ent${d:5,r:sa$[a_int]},
                                ts_ent${d:3,r:sa$[a_string]},
                                ts_ent${d:6,r:sa$[a_string]},
                                ts_ent${d:1,r:sa$[a_geta,a_getb]},
                                ts_ent${d:4,r:sa$[a_geta,a_getb]}],
                 transitions:mytrans},
            true)

    % that pretty well tests the closure with equates
    % now for some recursion
    % N1 = ARRAY[N1]
    %   does it work?
    sexpect("N1=A[N1]",
            vimsota$close(vimsota$has_subpath_to(s_create,1,a_array,1),
                          1,sa$[a_array]),
            vsr${equivs:ernn$[],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_array]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no1}]}]},
            true)

    % N1 = ARRAY[N2]
    % N2 = ARRAY[N1]
    sexpect("N1=A[N2] N2=A[N1]",
            vimsota$has_subpath_to(
                vimsota$has_subpath_to(
                    vimsota$close(
                        vimsota$close(s_create,2,sa$[a_array]),
                        1,sa$[a_array]),
                    1,a_array,2), 2,a_array,1),
            vsr${equivs:ernn$[],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_array]},
                                ts_ent${d:2,r:sa$[a_array]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no2}]},
                             tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
            true)

    % N1 = closed_STRUCT[a:N2, b:N3]
    % N2 = open_STRUCT[a:N2, b:N4]
    % N4 = N1
    % N2 = N3
    % N3 = N4
    %   everything should come out to be the same thing
    tmp:=vimsota$close(vimsota$has_subpath_to(
                           vimsota$has_subpath_to(s_create,1,a_geta,2),
                           1,a_getb,3),
                       1,sa$[a_geta,a_getb])
    tmp:=vimsota$has_subpath_to(vimsota$has_subpath_to(tmp,2,a_geta,2),
                                2,a_getb,4)
    tmp:=vimsota$equate(vimsota$equate(vimsota$equate(tmp,1,4),2,3),3,4)
    _trans:tano:=tano$[ta_ent${d:a_geta,r:no1},
                       ta_ent${d:a_getb,r:no2}]
    _close:sa:=sa$[a_geta,a_getb]
    sexpect("Ex C",tmp,
            vsr${equivs:ernn$[snn$[1,2,3,4]],
                 closures:tnsa$[ts_ent${d:1,r:_close},
                                ts_ent${d:2,r:_close},
                                ts_ent${d:3,r:_close},
                                ts_ent${d:4,r:_close}],
                 transitions:tntano$[tn_ent${d:1,r:_trans},
                                     tn_ent${d:2,r:_trans},
                                     tn_ent${d:3,r:_trans},
                                     tn_ent${d:4,r:_trans}]},
            true)
    % N1 = CLOSED_STRUCT[a:N2, b:N3]
    % N2 = CLOSED_STRUCT[a:N1, c:N3]
    % N3 = INT
    %   does it work?
    tmp:=vimsota$has_closed_path(s_create,3,a_int)
    tmp:=vimsota$close(vimsota$has_subpath_to(
                           vimsota$has_subpath_to(tmp,1,a_geta,2),
                           1,a_getb,3),
                       1,sa$[a_geta,a_getb])
    tmp:=vimsota$close(vimsota$has_subpath_to(
                           vimsota$has_subpath_to(tmp,2,a_geta,1),
                           2,a_getc,3),
                       2,sa$[a_geta,a_getc])
    sexpect(
        "Ex D",tmp,
        vsr${equivs:ernn$[],
             closures:tnsa$[ts_ent${d:1,r:sa$[a_geta,a_getb]},
                            ts_ent${d:2,r:sa$[a_geta,a_getc]},
                            ts_ent${d:3,r:sa$[a_int]}],
             transitions:tntano$[
                 tn_ent${d:1,r:tano$[ta_ent${d:a_geta,r:no2},
                                     ta_ent${d:a_getb,r:no3}]},
                 tn_ent${d:2,r:tano$[ta_ent${d:a_geta,r:no1},
                                     ta_ent${d:a_getc,r:no3}]},
                 tn_ent${d:3,r:tano$[ta_ent${d:a_int,r:noa}]}]},
        true)



    sexpect("Closure, but not all there",
            vimsota$close(vimsota$has_subpath_to(
                              vimsota$has_closed_path(s_create,2,a_int),
                              1,a_geta,2),
                          1,sa$[a_geta,a_getb]),
            vsr${equivs:ernn$[],
                 closures:tnsa$[ts_ent${d:1,r:sa$[a_geta,a_getb]},
                                ts_ent${d:2,r:sa$[a_int]}],
                 transitions:
                     tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_geta,r:no2}]},
                             tn_ent${d:2,r:tano$[ta_ent${d:a_int,r:noa}]}]},
            false)

    % check for class error
    tmp:=vimsota$has_subpath_to(s_create,1,a_geta,2)
    begin
        tmp:=vimsota$has_subpath_to(tmp,1,a_array,2)
        signal failure("class error 1 not caught")
        end except when empty: stream$putl(stream$primary_output(),
                                           "class error 1 caught ok")
            end

    tmp:=vimsota$has_path(s_create,1,a_int)
    begin
        tmp:=vimsota$has_path(tmp,1,a_string)
        signal failure("class error 2 not caught")
        end except when empty: stream$putl(stream$primary_output(),
                                           "class error 2 caught ok")
            end

    % check for path with non-terminator error
    begin
        tmp:=vimsota$has_path(s_create,1,a_array)
        signal failure("has_path with non terminator not caught")
        end except when non_terminator:
                        stream$putl(stream$primary_output(),
                                    "has_path with non terminator caught ok")
            end

    % check for subpath_to with terminator error
    begin
        tmp:=vimsota$has_subpath_to(s_create,1,a_int,2)
        signal failure("has_subpath_to with terminator not caught")
        end except when terminator:
                        stream$putl(stream$primary_output(),
                                    "has_subpath_to with terminator caught ok")
            end

end sotatest 

% if the rep of the mysota is not equal to expected_rep then prints an
% error, otherwise prints "ok"
sexpect=proc(name:string, mysota:vimsota, expected_rep:vimsotarep, guta:bool)
    own po:stream:=stream$primary_output()
    died:bool:=false
    exp:vimsotarep:=vimsota$export(mysota)
    stream$puts(po,name)
    if exp.equivs=expected_rep.equivs then
        stream$puts(po," equivs ok,")
    else
        stream$puts(po," equivs broken,")
        died:=true
        end
    if exp.closures=expected_rep.closures then
        stream$puts(po," closures ok,")
    else
        stream$puts(po," closures broken,")
        died:=true
        end
    % have to do the mapping test badly, sigh, this is because
    % I am really modeling (nodename,alphabet)->nodename, but
    % ended up using nodename->(alphabet->nodename)
    trandied:bool:=false
    begin
        for tn:nodename, ta:tano in tntano$entries(exp.transitions) do
            for ts:alphabet, tno:no in tano$entries(ta) do
                etn:no:=expected_rep.transitions[tn][ts]
                tagcase tno
                   tag acceptor: if etn~=tno then exit bad_map end
                   tag node(num:int):
                       if ~set[int]$ElementOf(no$value_node(etn),
                                              exp.equivs[num])
                          then exit bad_map end
                   end
                end
            end
        for tn:nodename, ta:tano in tntano$entries(expected_rep.transitions) do
            for ts:alphabet, tno:no in tano$entries(ta) do
                etn:no:=exp.transitions[tn][ts]
                tagcase tno
                   tag acceptor: if etn~=tno then exit bad_map end
                   tag node(num:int):
                       if ~set[int]$ElementOf(no$value_node(etn),
                                              expected_rep.equivs[num])
                          then exit bad_map end
                   end
                end
            end
        end
       except when undefined, bad_map, wrong_type: trandied:=true died:=true end

    if trandied then stream$puts(po," transitions broken,")
       else stream$puts(po," transitions ok,") end

    begin
        vimsota$get_unique_type_assignment(mysota)
        if guta then stream$putl(po," guta defined ok")
        else
            stream$putl(po," expected guta ambiguity, it wasn't")
            died:=true
            end
        end
       except when ambiguous:
                   if guta then
                       stream$putl(po," but expect guta defined, it wasn't")
                       died:=true
                   else
                       stream$putl(po," guta ambiguous ok")
                       end
              end
    if died then signal failure("died -- ") end
    end sexpect